Description

This application is a divisional application from European Patent Application No 94909609.3 filed 10th February 
1994. 

The present invention relates to novel alpha-amylase mutants having an amino acid sequence not found in nature,
such mutants having an amino acid sequence wherein one or more amino acid residue(s) of a precursor alpha-amylase,
specifically an oxidizable amino acid, have been substituted with a different amino acid. The mutant enzymes of the
present invention exhibit altered stability/activity profiles including but not limited to altered oxidative stability, altered
pH performance profile, altered specific activity and/or altered thermostability. In a particular embodiment the invention
provides Bacillus alpha-amylases having a substitution or deletion of an amino acid at a position equivalent to M+15
in Bacillus licheniformis alpha-amylase and provides uses of these alpha-amylases.

Alpha-amylases (alpha-1,4-glucan-4-glucanohydrolase, EC 3.2.1.1) hydrolyze internal alpha-1,4-glucosidic linkages
in starch largely at random, to produce smaller molecular weight maltodextrins. Alpha-amylases are of considerable
commercial value, being used in the initial stages (liquefaction) of starch processing; in alcohol production; as
cleaning agents in detergent matrices; and in the textile industry for starch desizing. Alpha-amylases are produced by
a wide variety of microorganisms including Bacillus and Aspergillus, with most commercial amylases being produced
from bacterial sources such as B. licheniformis, B. amyloliquefaciens, B. subtilis, or B. stearothermophilus. In recent
years the preferred enzymes in commercial use have been those from B. licheniformis because of their heat stability
and performance, at least at neutral and mildly alkaline pH's.

Previously there have been studies using recombinant DNA techniques to explore which residues are important
for the catalytic activity of amylases and/or to explore the effect of modifying certain amino acids within the active site
of various amylases (Vihinen, M. et al. (1990) J. Biochem. 107:267-272; Holm, L. et al. (1990) Protein Engineering 3:
181-191; Takase, K. et al. (1992) Biochimica et Biophysica Acta 1120:281-288; Matsui, I. et al. (1992) FEBS Letters
Vol. 310, No. 3, pp. 216-218); and which residues are important for thermal stability (Suzuki, Y. et al. (1989) J. Biol. Chem.
264:18933-18938). One group has used such methods to introduce mutations at various histidine residues in a B.
licheniformis amylase; the rationale for making substitutions at histidine residues was that B. licheniformis amylase
(known to be thermostable), when compared to other similar Bacillus amylases, has an excess of histidines and, therefore,
it was suggested that replacing a histidine could affect the thermostability of the enzyme (Declerck, N. et al. (1990)
J. Biol. Chem. 265:15481-15488; FR 2 665 178-A1; Joyet, P. et al. (1992) Bio/Technology 10:1579-1583).

It has been found that alpha-amylase is inactivated by hydrogen peroxide and other oxidants at pH's between 4
and 10.5 as described in the examples herein. Commercially, alpha-amylase enzymes can be used under dramatically
different conditions such as both high and low pH conditions, depending on the commercial application. For example,
alpha-amylases may be used in the liquefaction of starch, a process preferably performed at a low pH (pH < 5.5). On
the other hand, amylases may be used in commercial dish care or laundry detergents, which often contain oxidants
such as bleach or peracids, and which are used in much more alkaline conditions.

In order to alter the stability or activity profile of amylase enzymes under varying conditions, it has been found that
selective replacement, substitution or deletion of oxidizable amino acids, such as methionine, tryptophan, tyrosine,
histidine or cysteine, results in an altered profile of the variant enzyme as compared to its precursor. Because currently
commercially available amylases are not acceptable (stable) under various conditions, there is a need for an amylase
having an altered stability and/or activity profile. This altered stability (oxidative, thermal or pH performance profile)
can be achieved while maintaining adequate enzymatic activity, as compared to the wild-type or precursor enzyme.
The characteristic affected by introducing such mutations may be a change in oxidative stability while maintaining
thermal stability or vice versa. Accordingly, the substitution of different amino acids for an oxidizable amino acid(s) in
the alpha-amylase precursor sequence or the deletion of one or more oxidizable amino acid(s) may result in altered
enzymatic activity at a pH other than that which is considered optimal for the precursor alpha-amylase. In other words,
the mutant enzymes of the present invention may also have altered pH performance profiles, which may be due to the
enhanced oxidative stability of the enzyme.

The present invention relates to novel alpha-amylase mutants that are the expression product of a mutated DNA
sequence encoding an alpha-amylase, the mutated DNA sequence being derived from a precursor alpha-amylase by
the deletion or substitution (replacement) of one or more oxidizable amino acids. In particular the invention relates to a
mutant alpha-amylase that is the expression product of a mutated DNA sequence encoding an alpha-amylase, the
mutated DNA sequence being derived from a precursor alpha-amylase which is a Bacillus alpha-amylase by substitution
or deletion of an amino acid at a position equivalent to M+15 in Bacillus licheniformis alpha-amylase.

In another embodiment of the present invention the mutants comprise a substitution of one or more tryptophan
residues alone or in combination with the substitution of one or more methionine residues in a precursor alpha-amylase.
Such mutant alpha-amylases, in general, are obtained by in vitro modification of a precursor DNA sequence encoding
a naturally occurring or recombinant alpha-amylase to encode the substitution or deletion of one or more amino acid
residues in a precursor amino acid sequence.
The substitution or deletion of one or more amino acids in the amino acid sequence is due to the replacement or
deletion of one or more methionine and/or tryptophan residues in such sequence. These oxidizable amino acid residues
may be replaced by any of the other 20 naturally occurring amino acids. If the desired effect is to alter the stability
of the precursor, the amino acid residue may be substituted with a non-oxidizable amino acid (such as alanine, arginine,
asparagine, aspartic acid, glutamic acid, glutamine, glycine, isoleucine, leucine, lysine, phenylalanine, proline, serine,
threonine, or valine) or another oxidizable amino acid (such as cysteine, methionine, tryptophan, tyrosine or histidine,
listed in order of most easily oxidizable to less readily oxidizable). Likewise, if the desired effect is to alter thermostability,
any of the other 20 naturally occurring amino acids may be substituted (e.g., cysteine may be substituted for methionine).
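
By way of illustration only, the oxidizable residues of a sequence may be located computationally. The following Python sketch is not part of the original disclosure. The function name is hypothetical, and the 17-residue fragment is an assumption inferred from the oligonucleotides of Table I and Seq ID No 10; Fig. 2 remains the authoritative sequence.

```python
# Oxidizable amino acids named in the text: cysteine, methionine,
# tryptophan, tyrosine and histidine (one-letter codes).
OXIDIZABLE = set("CMWYH")

# Assumed N-terminal fragment (+1 to +17) of the mature enzyme, inferred
# from the oligonucleotide sequences; illustrative only.
FRAGMENT = "ANLNGTLMQYFEWYMPN"


def oxidizable_positions(seq):
    """Return (position, residue) pairs, using 1-based numbering."""
    return [(i, aa) for i, aa in enumerate(seq, start=1) if aa in OXIDIZABLE]


print(oxidizable_positions(FRAGMENT))
# [(8, 'M'), (10, 'Y'), (13, 'W'), (14, 'Y'), (15, 'M')]
```

The positions reported for methionine (+8 and +15) agree with the numbering used for the M8 and M15 variants.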

The methionine to be replaced is a methionine at a position equivalent to position +15 in B. licheniformis
alpha-amylase. The preferred substitute amino acids at position +15 are leucine (L), threonine (T), asparagine (N), aspartate
(D), serine (S), valine (V) and isoleucine (I), although other substitute amino acids not specified above may be useful.
A specifically preferred mutant of the present invention is M15L.
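
The shorthand M15L (wild-type residue, position, substitute residue) can be made concrete with a short sketch. The Python below is illustrative and not part of the original disclosure: the function name is hypothetical, and the fragment is an assumed N-terminal stretch (+1 to +17) inferred from the oligonucleotide sequences given later; Fig. 2 is the authoritative sequence.

```python
import re

# Assumed N-terminal fragment (+1 to +17); illustrative only.
FRAGMENT = "ANLNGTLMQYFEWYMPN"


def apply_substitution(seq, mutation):
    """Apply a substitution written as <wild type><position><new>, e.g. M15L."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation: {mutation!r}")
    wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at +{pos}, found {seq[pos - 1]}")
    # Positions are 1-based, so the residue to replace is at index pos - 1.
    return seq[:pos - 1] + new + seq[pos:]


print(apply_substitution(FRAGMENT, "M15L"))  # ANLNGTLMQYFEWYLPN
```

Checking the wild-type residue before substituting guards against numbering errors, for example when a precursor other than the B. licheniformis enzyme is used.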

Another embodiment of this invention relates to mutants comprising the substitution of a tryptophan residue equivalent
to any of the tryptophan residues found in B. licheniformis alpha-amylase (see Fig. 2). Preferably the tryptophan
to be replaced is at a position equivalent to +138 in B. licheniformis alpha-amylase. A mutation (substitution) at a
tryptophan residue may be made alone or in combination with mutations at other oxidizable amino acid residues.
Specifically, it may be advantageous to modify by substitution of at least one tryptophan in combination with at least
one methionine.

The alpha-amylase mutants of the present invention, in general, exhibit altered oxidative stability in the presence
of hydrogen peroxide and other oxidants such as bleach or peracids, or, more specifically, milder oxidants such as
chloramine-T. Mutant enzymes having enhanced oxidative stability will be useful in extending the shelf life and bleach,
perborate, percarbonate or peracid compatibility of amylases used in cleaning products. Similarly, reduced oxidative
stability may be useful in industrial processes that require the rapid and efficient quenching of enzymatic activity. The
mutant enzymes of the present invention may also demonstrate a broadened pH performance profile whereby mutants
such as M15L show stability for low pH starch liquefaction. The mutants of the present invention may also have altered
thermal stability whereby the mutant may have enhanced stability at either high or low temperatures. It is understood
that any change (increase or decrease) in the mutant's enzymatic characteristic(s), as compared to its precursor, may
be beneficial depending on the desired end use of the mutant alpha-amylase.

In addition to starch processing and cleaning applications, variant amylases of the present invention may be used
in any application in which known amylases are used; for example, variant amylases can be used in textile processing,
food processing, etc. Specifically, it is contemplated that a variant enzyme,
inactivated by oxidation, would be useful in a process where it is desirable to completely remove amylase activity at
the end of the process, for example, in frozen food processing applications.

The preferred alpha-amylase mutants of the present invention are derived from a Bacillus strain such as B. licheniformis,
B. amyloliquefaciens, and B. stearothermophilus, and most preferably from Bacillus licheniformis.

In another aspect of the present invention there is provided a novel form of the alpha-amylase normally produced 
by B. licheniformis. This novel form, designated as the A4 form, has an additional four alanine residues at the N- 
terminus of the secreted amylase. (Fig. 4b.) Derivatives or mutants of the A4 form of alpha-amylase are encompassed 
within the present invention. By derivatives or mutants of the A4 form, it is meant that the present invention comprises 
the A4 form alpha-amylase containing one or more additional mutations such as, for example, mutation (substitution, 
replacement or deletion) of one or more oxidizable amino acid(s). 

In a composition embodiment of the present invention there are provided detergent compositions, liquid, gel or 
granular, comprising the alpha-amylase mutants described herein. Additionally, it is contemplated that the compositions 
of the present invention may include an alpha-amylase mutant having more than one site-specific mutation. 

In yet another composition embodiment of the present invention there are provided compositions useful in starch
processing and particularly starch liquefaction. The starch liquefaction compositions of the present invention preferably
comprise an alpha-amylase mutant having a substitution or deletion at position M15. Additionally, it is contemplated
that such compositions may comprise additional components as known to those skilled in the art, including, for example,
antioxidants, calcium ions, etc.

In a process aspect of the present invention there are provided methods for liquefying starch, and particularly
granular starch slurries, from either a wet or dry milled process. Generally, in the first step of the starch degradation
process, the starch slurry is gelatinized by heating at a relatively high temperature (up to about 110°C). After the starch
slurry is gelatinized it is liquefied and dextrinized using an alpha-amylase. The conditions for such liquefaction are
described in US patent applications 07/785,624 and 07/785,623 and US Patent 5,180,669. The present method for
liquefying starch comprises adding to a starch slurry an effective amount of an alpha-amylase of the present invention,
alone or in combination with additional excipients such as an antioxidant, and reacting the slurry for an appropriate
time and temperature to liquefy the starch.

A further aspect of the present invention comprises the DNA encoding the mutant alpha-amylases of the present 
invention (including A4 form and mutants thereof) and expression vectors encoding the DNA as well as host cells
transformed with such expression vectors.

The invention will now be described by way of example with reference to the accompanying drawings:- 

Fig. 1 shows the DNA sequence of the gene for alpha-amylase from B. licheniformis (NCIB8061), Seq ID No 31,
and deduced translation product as described in Gray, G. et al. (1986) J. Bacter. 166:635-643.

Fig. 2 shows the amino acid sequence of the mature alpha-amylase enzyme from B. licheniformis (NCIB8061),
Seq ID No 32.

Fig. 3 shows an alignment of primary structures of Bacillus alpha-amylases. The B. licheniformis amylase (Am-Lich),
Seq ID No 33, is described by Gray, G. et al. (1986) J. Bact. 166:635-643; the B. amyloliquefaciens amylase
(Am-Amylo), Seq ID No 34, is described by Takkinen, K. et al. (1983) J. Biol. Chem. 258:1007-1013; and the B.
stearothermophilus amylase (Am-Stearo), Seq ID No 35, is described by Ihara, H. et al. (1985) J. Biochem. 98:95-103.

Fig. 4 shows the amino acid sequence of the mature alpha-amylase variant M197T, Seq ID No 36. 

Fig. 4b shows the amino acid sequence of the A4 form of alpha-amylase from B. licheniformis NCIB8061, Seq ID
No 37. Numbering is from the N-terminus, starting with the four additional alanines.

Fig. 5 shows plasmid pA4BL wherein BLAA refers to B. licheniformis alpha-amylase gene, PstI to SstI; AmpR refers
to the ampicillin-resistant gene from pBR322; and CAT refers to the chloramphenicol-resistant gene from pC194.

Fig. 6 shows the signal sequence-mature protein junctions for B. licheniformis (Seq ID No 38), B. subtilis (Seq ID
No 39), B. licheniformis in pA4BL (Seq ID No 40) and B. licheniformis in pBLapr (Seq ID No 41).

Fig. 7 shows inactivation of certain alpha-amylases (Spezyme® AA20, M15L) with 0.88M H2O2 at pH 5.0, 25°C.

Fig. 8 shows a schematic for the production of M15X cassette mutants. 

Fig. 9 shows expression of M15X variants. 

Fig. 10 shows specific activity of M15X variants on soluble starch. 

Fig. 11 shows heat stability of M15X variants at 90°C, pH 5.0, 5 mM CaCl2, 5 mins.

Fig. 12 shows specific activity on starch and soluble substrate, and performance in jet liquefaction at pH 5.5, of
M15 variants as a function of percent activity of B. licheniformis wild-type.

Fig. 13 shows the inactivation of B. licheniformis alpha-amylase (AA20 at 0.65 mg/ml) with chloramine-T at pH 8.0
as compared to variants M197A (1.7 mg/ml) and M197L (1.7 mg/ml).

Fig. 14 shows the inactivation of B. licheniformis alpha-amylase (AA20 at 0.22 mg/ml) with chloramine-T at pH 4.0
as compared to variants M197A (4.3 mg/ml) and M197L

Fig. 15 shows the reaction of B. licheniformis alpha-amylase (AA20 at 0.75 mg/ml) with chloramine-T at pH 5.0
as compared to double variants M197T/W138F (0.64 mg/ml) and M197T/W138Y (0.60 mg/ml).

It is believed that amylases used in starch liquefaction may be subject to some form of inactivation due to some
activity present in the starch slurry (see US applications 07/785,624 and 07/785,623 and US Patent 5,180,669, issued
January 19, 1993). Furthermore, use of an amylase in the presence of oxidants, such as in bleach or peracid containing
detergents, may result in partial or complete inactivation of the amylase. Therefore, the present invention focuses on
altering the oxidative sensitivity of amylases. The mutant enzymes of the present invention may also have an altered
pH profile and/or altered thermal stability which may be due to the enhanced oxidative stability of the enzyme at low
or high pH's.

Alpha-amylase as used herein includes naturally occurring amylases as well as recombinant amylases. Preferred
amylases in the present invention are alpha-amylases derived from B. licheniformis or B. stearothermophilus, including
the A4 form of alpha-amylase derived from B. licheniformis as described herein, as well as fungal alpha-amylases such as
those derived from Aspergillus (e.g., A. oryzae and A. niger).

Recombinant alpha-amylase refers to an alpha-amylase in which the DNA sequence encoding the naturally occurring
alpha-amylase is modified to produce a mutant DNA sequence which encodes the substitution, insertion or
deletion of one or more amino acids in the alpha-amylase sequence. Suitable modification methods are disclosed
herein, and also in US Patents 4,760,025 and 5,185,258.

Homologies have been found between almost all endo-amylases sequenced to date, ranging from plants, mammals,
and bacteria (Nakajima, R. et al. (1986) Appl. Microbiol. Biotechnol. 23:355-360; Rogers, J.C. (1985) Biochem.
Biophys. Res. Commun. 128:470-476). There are four areas of particularly high homology in certain Bacillus amylases,
as shown in Fig. 3, wherein the underlined sections designate the areas of high homology. Further, sequence alignments
have been used to map the relationship between Bacillus endo-amylases (Feng, D.F. and Doolittle, R.F. (1987) J.
Molec. Evol. 25:351-360). The relative sequence homology between B. stearothermophilus and B. licheniformis amylase
is about 66%, as determined by Holm, L. et al. (1990) Protein Engineering 3 (3) pp. 181-191. The sequence
homology between B. licheniformis and B. amyloliquefaciens amylases is about 81%, as per Holm, L. et al., supra.
While sequence homology is important, it is generally recognized that structural homology is also important in comparing
amylases or other enzymes. For example, structural homology between fungal amylases and bacterial (Bacillus)
amylase has been suggested and, therefore, fungal amylases are encompassed within the present invention.
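
As an illustration of how a figure such as "about 66%" might be computed, the following Python sketch calculates percent identity over the columns of a pairwise alignment in which neither sequence has a gap. This is an assumption about the measure: the cited authors may have used a different definition of homology. The function name and the toy alignment are hypothetical and do not reproduce Fig. 3.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent of identical residues over alignment columns without gaps."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aln_a, aln_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not taken from Fig. 3).
print(percent_identity("ANLNGTLMQY", "--VNGTLMQY"))  # 87.5
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so values from different sources are comparable only when the denominator is known.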

An alpha-amylase mutant has an amino acid sequence which is derived from the amino acid sequence of a precursor
alpha-amylase. The precursor alpha-amylases include naturally occurring alpha-amylases and recombinant
alpha-amylases (as defined). The amino acid sequence of the alpha-amylase mutant is derived from the precursor
alpha-amylase amino acid sequence by the substitution, deletion or insertion of one or more amino acids of the precursor
amino acid sequence. Such modification is of the precursor DNA sequence which encodes the amino acid
sequence of the precursor alpha-amylase rather than manipulation of the precursor alpha-amylase enzyme per se.
Suitable methods for such manipulation of the precursor DNA sequence include methods disclosed herein and in US
patents 4,760,025 and 5,185,258.

Specific residues corresponding to positions M15 and W138 of Bacillus licheniformis alpha-amylase are identified
herein for substitution or deletion, as are all methionine, histidine, tryptophan, cysteine and tyrosine positions. The
amino acid position number (e.g., +197) refers to the number assigned to the mature Bacillus licheniformis
alpha-amylase sequence presented in Fig. 2. The invention, however, is not limited to the mutation of this particular mature
alpha-amylase (B. licheniformis) but extends to precursor alpha-amylases containing amino acid residues at positions
which are equivalent to the particular identified residue in B. licheniformis alpha-amylase. A residue (amino acid) of a
precursor alpha-amylase is equivalent to a residue of B. licheniformis alpha-amylase if it is either homologous (i.e.,
corresponding in position in either primary or tertiary structure) or analogous to a specific residue or portion of that
residue in B. licheniformis alpha-amylase (i.e., having the same or similar functional capacity to combine, react, or
interact chemically or structurally).
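
Equivalence by primary-structure homology can be sketched as a lookup through a pairwise alignment: the residue of the precursor that shares an alignment column with the identified B. licheniformis residue is taken as equivalent. The Python below is a minimal illustration under that assumption. The function name and toy alignment are hypothetical, and the sketch does not address equivalence by tertiary structure or by analogy.

```python
def equivalent_position(ref_aln, other_aln, ref_pos, gap="-"):
    """Map a 1-based position in the reference sequence to the other sequence.

    Both arguments are rows of one pairwise alignment. Returns None when the
    reference residue is aligned with a gap in the other sequence.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_count = other_count = 0
    for ref_res, other_res in zip(ref_aln, other_aln):
        if ref_res != gap:
            ref_count += 1
        if other_res != gap:
            other_count += 1
        if ref_res != gap and ref_count == ref_pos:
            return other_count if other_res != gap else None
    raise ValueError(f"position {ref_pos} is not in the reference sequence")


# Hypothetical toy alignment: the reference has two extra N-terminal residues.
REF = "ANLNGTLMQY"
OTHER = "--VNGTLMQY"
print(equivalent_position(REF, OTHER, 8))  # 6
```

In this toy case the methionine at +8 of the reference corresponds to position 6 of the shorter sequence, which shows why positions are stated as "equivalent to" a B. licheniformis position rather than as absolute numbers.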

In order to establish homology to primary structure, the amino acid sequence of a precursor alpha-amylase is
directly compared to the B. licheniformis alpha-amylase primary sequence and particularly to a set of residues known
to be invariant to all alpha-amylases for which sequence is known, as seen in Fig. 3. It is possible also to determine
equivalent residues by tertiary structure: crystal structures have been reported for porcine pancreatic alpha-amylase
(Buisson, G. et al. (1987) EMBO J. 6:3909-3916); Taka-amylase A from Aspergillus oryzae (Matsuura, Y. et al. (1984)
J. Biochem. (Tokyo) 95:697-702); and an acid alpha-amylase from A. niger (Boel, E. et al. (1990) Biochemistry 29:
6244-6249), with the former two structures being similar. There are no published structures for Bacillus alpha-amylases,
although there are predicted to be common super-secondary structures between glucanases (MacGregor, E.A. &
Svensson, B. (1989) Biochem. J. 259:145-152) and a structure for the B. stearothermophilus enzyme has been modeled
on that of Taka-amylase A (Holm, L. et al. (1990) Protein Engineering 3:181-191). The four highly conserved regions
shown in Fig. 3 contain many residues thought to be part of the active-site (Matsuura, Y. et al. (1984) J. Biochem.
(Tokyo) 95:697-702; Buisson, G. et al. (1987) EMBO J. 6:3909-3916; Vihinen, M. et al. (1990) J. Biochem. 107:267-272)
including, in the licheniformis numbering, His105; Arg229; Asp231; His235; Glu261 and Asp328.

Expression vector as used herein refers to a DNA construct containing a DNA sequence which is operably linked
to a suitable control sequence capable of effecting the expression of said DNA in a suitable host. Such control sequences
may include a promoter to effect transcription, an optional operator sequence to control such transcription, a
sequence encoding suitable mRNA ribosome-binding sites, and sequences which control termination of transcription
and translation. A preferred promoter is the B. subtilis aprE promoter. The vector may be a plasmid, a phage particle,
or simply a potential genomic insert. Once transformed into a suitable host, the vector may replicate and function
independently of the host genome, or may, in some instances, integrate into the genome itself. In the present specification,
plasmid and vector are sometimes used interchangeably as the plasmid is the most commonly used form of
vector at present. However, the invention is intended to include such other forms of expression vectors which serve
equivalent functions and which are, or become, known in the art.

Host strains (or cells) useful in the present invention generally are procaryotic or eucaryotic hosts and include any
transformable microorganism in which the expression of alpha-amylase can be achieved. Specifically, host strains of
the same species or genus from which the alpha-amylase is derived are suitable, such as a Bacillus strain. Preferably
an alpha-amylase negative Bacillus strain (genes deleted) and/or an alpha-amylase and protease deleted Bacillus
strain such as Bacillus subtilis strain BG2473 (ΔamyE, Δapr, Δnpr) is used. Host cells are transformed or transfected
with vectors constructed using recombinant DNA techniques. Such transformed host cells are capable of either replicating
vectors encoding the alpha-amylase and its variants (mutants) or expressing the desired alpha-amylase.

Preferably the mutants of the present invention are secreted into the culture medium during fermentation. Any 
suitable signal sequence, such as the aprE signal peptide, can be used to achieve secretion. 

Many of the alpha-amylase mutants of the present invention are useful in formulating various detergent compositions,
particularly certain dish care cleaning compositions, especially those cleaning compositions containing known
oxidants. Alpha-amylase mutants of the invention can be formulated into known powdered, liquid or gel detergents
having a pH between 6.5 and 12.0. Suitable granular compositions may be made as described in commonly owned US
patent applications 07/4290,881, 07/533,721 and 07/957,973. These detergent cleaning compositions can also contain
other enzymes, such as known proteases, lipases, cellulases, endoglycosidases or other amylases, as well as builders,
stabilizers or other excipients known to those skilled in the art. These enzymes can be present as co-granules or as
blended mixes or in any other manner known to those skilled in the art. Furthermore, it is contemplated by the present
invention that multiple mutants may be useful in cleaning or other applications. For example, a mutant enzyme having
changes at both +15 and +197 may exhibit enhanced performance useful in a cleaning product.

As described previously, alpha-amylase mutants of the present invention may also be useful in the liquefaction of
starch. Starch liquefaction, particularly granular starch slurry liquefaction, is typically carried out at near neutral pH's
and high temperatures. As described in US applications 07/788,624 and 07/785,623 and US Patent 5,180,669, it appears
that an oxidizing agent or inactivating agent of some sort is also present in typical liquefaction processes, which
may affect the enzyme activity; thus, in these related patent applications an antioxidant is added to the process to
protect the enzyme.

Based on the conditions of a preferred liquefaction process, as described in US applications 07/788,624 and
07/785,623 and US Patent 5,180,669, namely low pH, high temperature and potential oxidation conditions, preferred
mutants of the present invention for use in liquefaction processes comprise mutants exhibiting altered pH performance
profiles (i.e., low pH profile, pH <6 and preferably pH <5.5), and/or altered thermal stability (i.e., high temperature,
about 90°-110°C), and/or altered oxidative stability (i.e., enhanced oxidative stability).

Thus, an improved method for liquefying starch is taught by the present invention, the method comprising liquefying
a granular starch slurry from either a wet or dry milling process at a pH from about 4 to 6 by adding an effective amount
of an alpha-amylase mutant of the present invention to the starch slurry; optionally adding an effective amount of an
antioxidant or other excipient to the slurry; and reacting the slurry for an appropriate time and temperature to liquefy
the starch.

The following is presented by way of example and is not to be construed as a limitation to the scope of the claims.
Abbreviations used herein, particularly three letter or one letter notations for amino acids, are described in Dale, J.W.,
Molecular Genetics of Bacteria, John Wiley & Sons, (1989) Appendix B.

Experimental 

Example 1

Substitutions for the Methionine Residues in B. licheniformis Alpha-Amylase

The alpha-amylase gene (Fig. 1) was cloned from B. licheniformis NCIB8061 obtained from the National Collection
of Industrial Bacteria, Aberdeen, Scotland (Gray, G. et al. (1986) J. Bacteriology 166:635-643). The 1.72kb PstI-SstI
fragment, encoding the last three residues of the signal sequence, the entire mature protein and the terminator region,
was subcloned into M13MP18. A synthetic terminator was added between the BclI and SstI sites using a synthetic
oligonucleotide cassette of the form:

   BclI                                                SstI
5' GATCAAAACATAAAAAACCGGCCTTGGCCCCGCCGGTTTTTTATTATTTTTGAGCT 3'
3'     TTTTGTATTTTTTGGCCGGAACCGGGGCGGCCAAAAAATAATAAAAAC     5'

Seq ID No 1
designed to contain the B. amyloliquefaciens subtilisin transcriptional terminator (Wells et al. (1983) Nucleic Acids
Research 11:7911-7925).
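
The base pairing of the two cassette strands can be checked with a short script. The Python below is an illustrative sketch, not part of the original disclosure. It assumes that the upper strand carries a four-base 5' GATC extension (compatible with a BclI end) and a four-base 3' AGCT extension (compatible with an SstI end), and that the remaining 48 bases are paired with the lower strand; the variable names are hypothetical.

```python
# Upper strand, written 5'->3' as in the cassette.
TOP = "GATCAAAACATAAAAAACCGGCCTTGGCCCCGCCGGTTTTTTATTATTTTTGAGCT"
# Lower strand, written 3'->5' exactly as printed beneath the upper strand.
BOTTOM_3_TO_5 = "TTTTGTATTTTTTGGCCGGAACCGGGGCGGCCAAAAAATAATAAAAAC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, without reversing the strand."""
    return seq.translate(COMPLEMENT)


# Assumed single-stranded extensions at the two ends of the upper strand.
five_prime_overhang = TOP[:4]
three_prime_overhang = TOP[-4:]
paired_region = TOP[4:-4]

print(five_prime_overhang, three_prime_overhang)    # GATC AGCT
print(complement(paired_region) == BOTTOM_3_TO_5)   # True
```

Because the lower strand is printed 3' to 5', a plain complement (rather than a reverse complement) of the paired region is the correct comparison.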

Site-directed mutagenesis by oligonucleotides used essentially the protocol of Zoller, M. et al. (1983) Meth. Enzymol.
100:468-500: briefly, 5'-phosphorylated oligonucleotide primers were used to introduce the desired mutations on
the M13 single-stranded DNA template using the oligonucleotides listed in Table I to substitute for each of the seven
methionines found in B. licheniformis alpha-amylase. Each mutagenic oligonucleotide also introduced a restriction
endonuclease site to use as a screen for the linked mutation.

TABLE I

Mutagenic Oligonucleotides for the Substitution of the
Methionine Residues in B. licheniformis Alpha-Amylase

M8A
5'-T GGG ACG CTG CCG CAG TAC TTT GAA TGG T-3'   Seq ID No 2
ScaI+

M15L
5'-TG ATG CAG TAC TTT GAA TGG TAC CTG CCC AAT GA-3'   Seq ID No 3
ScaI+ KpnI+

M197L
5'-GAT TAT TTG TTG TAT GCC GAT ATC GAC TAT GAC CAT-3'   Seq ID No 4
EcoRV+

M256A
5'-CG GGG AAG GAG CCC TTT ACG GTA GCT-3'   Seq ID No 5

M304L
5'-GC GGC TAT GAC TTA AGG AAA TTG C-3'   Seq ID No 6

M366A
5'-C TAC GGG GAT CCA TAC GGG ACG A-3'   Seq ID No 7

M366Y
5'-C TAC GGG GAT TAC TAC GGG ACC AAC GGA GAC TCC C-3'   Seq ID No 8
StyI+

M438A
5'-CC GGT GGG GCC AAG CGG CCC TAT GTT GGC CGG CAA A-3'   Seq ID No 9

Bold letters indicate base changes introduced by the oligonucleotide.

Codon changes are indicated in the form M8A, where methionine (M) at position +8 has
been changed to alanine (A).

Underlining indicates the restriction endonuclease site introduced by the oligonucleotide.
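
The use of an introduced restriction site as a screen can be illustrated by searching an oligonucleotide for the recognition sequence. The Python sketch below is not part of the original disclosure; it covers only the M15L and M197L oligonucleotides of Table I and relies on the standard recognition sequences of ScaI (AGTACT), KpnI (GGTACC) and EcoRV (GATATC). The function and dictionary names are hypothetical.

```python
# Standard recognition sequences; all three are palindromic, so searching
# one strand is sufficient.
SITES = {"ScaI": "AGTACT", "KpnI": "GGTACC", "EcoRV": "GATATC"}

# Oligonucleotides as listed in Table I (spaces mark codon boundaries).
OLIGOS = {
    "M15L": "TG ATG CAG TAC TTT GAA TGG TAC CTG CCC AAT GA",
    "M197L": "GAT TAT TTG TTG TAT GCC GAT ATC GAC TAT GAC CAT",
}


def sites_present(oligo):
    """Return the names of the enzymes whose site occurs in the oligo."""
    seq = oligo.replace(" ", "").upper()
    return sorted(name for name, site in SITES.items() if site in seq)


for name, oligo in OLIGOS.items():
    print(name, sites_present(oligo))
# M15L ['KpnI', 'ScaI']
# M197L ['EcoRV']
```

A search of this kind shows only that a site is present in the oligonucleotide; whether the site is new relative to the wild-type gene would require comparison with the sequence of Fig. 1.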



The heteroduplex was used to transfect E. coli mutL cells (Kramer et al. (1984) Cell 38:879) and, after plaque-
purification, clones were analyzed by restriction analysis of the RF1's. Positives were confirmed by dideoxy sequencing
(Sanger et al. (1977) Proc. Natl. Acad. Sci. U.S.A. 74:5463-5467) and the PstI-SstI fragments for each subcloned into
an E. coli vector, plasmid pA4BL.

Plasmid pA4BL

Following the methods described in US application 860,468 (Power et al.), a silent PstI site was introduced at
codon +1 (the first amino acid following the signal cleavage site) of the aprE gene from pS168-1 (Stahl, M.L. and
Ferrari, E. (1984) J. Bacter. 158:411-418). The aprE promoter and signal peptide region was then cloned out of a
pJH101 plasmid (Ferrari, F.A. et al. (1983) J. Bacter. 154:1513-1515) as a HindIII-PstI fragment and subcloned into
the pUC18-derived plasmid JM102 (Ferrari, E. and Hoch, J.A. (1989) Bacillus, ed. C.R. Harwood, Plenum Pub., pp.
57-72). Addition of the PstI-SstI fragment from B. licheniformis alpha-amylase gave pA4BL (Fig. 5) having the resulting
aprE signal peptide-amylase junction as shown in Fig. 6.

Transformation into B. subtilis

pA4BL is a plasmid able to replicate in E. coli and integrate into the B. subtilis chromosome. Plasmids containing
different variants were transformed into B. subtilis (Anagnostopoulos, C. and Spizizen, J. (1961) J. Bacter. 81:741-746)
and integrated into the chromosome at the aprE locus by a Campbell-type mechanism (Young, M. (1984) J. Gen.
Microbiol. 130:1613-1621). The Bacillus subtilis strain BG2473 was a derivative of I168 which had been deleted for
amylase (ΔamyE) and two proteases (Δapr, Δnpr) (Stahl, M.L. and Ferrari, E., J. Bacter. 158:411-418 and US Patent
5,264,366, incorporated herein by reference). After transformation the sacU32(Hy) (Henner, D.J. et al. (1988) J. Bacter.
170:296-300) mutation was introduced by PBS-1 mediated transduction (Hoch, J.A. (1983) J. Bacter. 154:1513-1515).

N-terminal analysis of the amylase expressed from pA4BL in B. subtilis showed it to be processed having four
extra alanines at the N-terminus of the secreted amylase protein ("A4 form"). These extra residues had no significant
deleterious effect on the activity or thermal stability of the A4 form and in some applications may enhance performance.
In subsequent experiments the correctly processed forms of the licheniformis amylase and the variant M197T were
made from a very similar construction (see Fig. 6). Specifically, the 5' end of the A4 construction was subcloned on an
EcoRI-SstII fragment from pA4BL (Fig. 5) into M13BM20 (Boehringer Mannheim) in order to obtain a coding-strand
template for the mutagenic oligonucleotide below:



5'-CAT CAG CGT CCC ATT AAG ATT TGC AGC CTG CGC AGA CAT GTT GCT-3'

Seq ID No 10



This primer eliminated the codons for the extra four N-terminal alanines, correct forms being screened for by the
absence of the PstI site. Subcloning the EcoRI-SstII fragment back into the pA4BL vector (Fig. 5) gave plasmid pBLapr.
The M197T substitution could then be moved, on a SstII-SstI fragment, out of pA4BL (M197T) into the complementary
pBLapr vector to give plasmid pBLapr (M197T). N-terminal analysis of the amylase expressed from pBLapr in B. subtilis
showed it to be processed with the same N-terminus found in B. licheniformis alpha-amylase.

Example 2 

Oxidative Sensitivity of Methionine Variants 

B. licheniformis alpha-amylase, such as Spezyme® AA20 (commercially available from Genencor International,
Inc.), is inactivated rapidly in the presence of hydrogen peroxide (Fig. 7). Various methionine variants were expressed
in shake-flask cultures of B. subtilis and the crude supernatants purified by ammonium sulphate cuts. The amylase
was precipitated from a 20% saturated ammonium sulphate supernatant by raising the ammonium sulphate to 70%
saturated, and then resuspended. The variants were then exposed to 0.88M hydrogen peroxide at pH 5.0, at 25°C.
Variants at six of the methionine positions in B. licheniformis alpha-amylase were still subject to oxidation by peroxide
while the substitution at position +197 (M197L) showed resistance to peroxide oxidation (see Fig. 7). However,
subsequent analysis described in further detail below showed that while a variant may be susceptible to oxidation at pH
5.0, 25°C, it may exhibit altered/enhanced properties under different conditions (i.e., liquefaction).
Example 3

Construction of All Possible Variants at Position 197

All of the M197 variants (M197X) were produced in the A4 form by cassette mutagenesis, as outlined in Fig. 8:

1) Site directed mutagenesis (via primer extension in M13) was used to make M197A using the mutagenic
oligonucleotide below:

M197A

5'-GAT TAT TTG GCG TAT GCC GAT ATC GAC TAT GAC CAT-3'

EcoRV+, ClaI-    Seq ID No 11

which also inserted an EcoRV site (codons 200-201) to replace the ClaI site (codons 201-202).

2) Then primer LAAM12 (Table II) was used to introduce another silent restriction site (BstBI) over codons 186-188.

3) The resultant M197A (BstBI+, EcoRV+) variant was then subcloned (PstI-SstI fragment) into plasmid pA4BL
and the resultant plasmid digested with BstBI and EcoRV and the large vector-containing fragment isolated by
electroelution from agarose gel.

4) Synthetic primers LAAM14-30 (Table II) were each annealed with the largely complementary common primer
LAAM13 (Table II). The resulting cassettes encoded for all the remaining naturally occurring amino acids at position
+197 and were ligated, individually, into the vector fragment prepared above.



TABLE II

Synthetic Oligonucleotides Used for Cassette Mutagenesis
to Produce M197X Variants

LAAM12 GG GAA GTT TCG AAT GAA AAC G Seq ID No 12

LAAM13 X197bs Seq ID No 13
(EcoRV) GTC GGC ATA TG CAT ATA ATC ATA GTT GCC GTT TTC ATT (BstBI)

LAAM14 I197 Seq ID No 14
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG ATC TAT GCC GAC (EcoRV-)

LAAM15 F197 Seq ID No 15
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG TTC TAT GCC GAC (EcoRV-)

LAAM16 V197 Seq ID No 16
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG GTT TAT GCC GAC (EcoRV-)

LAAM17 S197 Seq ID No 17
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG AGC TAT GCC GAC (EcoRV-)

LAAM18 P197 Seq ID No 18
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG CCT TAT GCC GAC (EcoRV-)

LAAM19 T197 Seq ID No 19
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG ACA TAT GCC GAC (EcoRV-)

LAAM20 Y197 Seq ID No 20
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG TAC TAT GCC GAC (EcoRV-)

LAAM21 H197 Seq ID No 21
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG CAC TAT GCC GAC (EcoRV-)

LAAM22 G197 Seq ID No 22
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG GGC TAT GCC GAC (EcoRV-)

LAAM23 Q197 Seq ID No 23
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG CAA TAT GCC GAC (EcoRV-)

LAAM24 N197 Seq ID No 24
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG AAC TAT GCC GAC (EcoRV-)

LAAM25 K197 Seq ID No 25
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG AAA TAT GCC GAC (EcoRV-)

LAAM26 D197 Seq ID No 26
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG GAT TAT GCC GAC (EcoRV-)

LAAM27 E197 Seq ID No 27
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG GAA TAT GCC GAC (EcoRV-)

LAAM28 C197 Seq ID No 28
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG TGT TAT GCC GAC (EcoRV-)

LAAM29 W197 Seq ID No 29
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG TGG TAT GCC GAC (EcoRV-)

LAAM30 R197 Seq ID No 30
(BstBI) CG AAT GAA AAC GGC AAC TAT GAT TAT TTG AGA TAT GCC GAC (EcoRV-)



The cassettes were designed to destroy the EcoRV site upon ligation; thus plasmids from E. coli transformants
were screened for loss of this unique site. In addition, the common bottom strand of the cassette contained a
frameshift and encoded an NsiI site; thus transformants derived from this strand could be eliminated by screening for the
presence of the unique NsiI site and would not be expected, in any case, to lead to expression of active amylase.
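
The screening logic above can be checked mechanically. The following is a minimal sketch, not the procedure used in the examples: it rebuilds the top-strand cassettes from the position-197 codons listed in Table II and confirms that joining each cassette to the EcoRV-cut vector end cannot regenerate the EcoRV site. All names are hypothetical, and the vector end sequence is an assumption taken from the M197A oligonucleotide (the bases following the EcoRV blunt cut).

```python
# Codons at position 197 for primers LAAM14-30, as read from Table II.
CODON_197 = {
    "I": "ATC", "F": "TTC", "V": "GTT", "S": "AGC", "P": "CCT", "T": "ACA",
    "Y": "TAC", "H": "CAC", "G": "GGC", "Q": "CAA", "N": "AAC", "K": "AAA",
    "D": "GAT", "E": "GAA", "C": "TGT", "W": "TGG", "R": "AGA",
}

UPSTREAM = "CGAATGAAAACGGCAACTATGATTATTTG"  # BstBI overhang through codon 196
DOWNSTREAM = "TATGCCGAC"                    # codons 198-200 (GAC instead of GAT)
VECTOR_AFTER_ECORV = "ATCGACTATGACCAT"      # assumed blunt vector end (from the M197A oligo)
ECORV_SITE = "GATATC"
NSII_SITE = "ATGCAT"
BOTTOM_STRAND = "GTCGGCATATGCATATAATCATAGTTGCCGTTTTCATT"  # LAAM13, as read from Table II


def top_strand(amino_acid):
    """Return the 41-base top-strand cassette encoding the given residue at +197."""
    return UPSTREAM + CODON_197[amino_acid] + DOWNSTREAM


def ecorv_destroyed(amino_acid):
    """True if the ligated cassette/vector junction no longer contains GATATC."""
    return ECORV_SITE not in top_strand(amino_acid) + VECTOR_AFTER_ECORV


if __name__ == "__main__":
    for aa in sorted(CODON_197):
        print(aa, top_strand(aa), "EcoRV lost:", ecorv_destroyed(aa))
    print("Bottom strand carries NsiI site:", NSII_SITE in BOTTOM_STRAND)
```

Because the cassette ends in GAC rather than GAT, the junction reads GACATC, which still encodes aspartate at codon 200 but is no longer cut by EcoRV.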

Positives by restriction analysis were confirmed by sequencing and transformed into B. subtilis for expression in
shake-flask cultures. The specific activity of certain of the M197X mutants was then determined using a soluble
substrate assay. The data generated using the following assay methods are presented below in Table III.

Soluble Substrate Assay:

A rate assay was developed based on an end-point assay kit supplied by Megazyme (Aust.) Pty. Ltd.: Each vial
of substrate (p-nitrophenyl maltoheptaoside, BPNPG7) was dissolved in 10 ml of sterile water, followed by a 1 to 4
dilution in assay buffer (50 mM maleate buffer, pH 6.7, 5 mM calcium chloride, 0.002% Tween20). Assays were
performed by adding 10 µl of amylase to 790 µl of the substrate in a cuvette at 25°C. Rates of hydrolysis were measured
as the rate of change of absorbance at 410 nm, after a delay of 75 seconds. The assay was linear up to rates of 0.4
absorbance units/min.

The amylase protein concentration was measured using the standard Bio-Rad assay (Bio-Rad Laboratories) based
on the method of Bradford (Bradford, M. (1976) Anal. Biochem. 72:248) using bovine serum albumin standards.
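
As an illustration of how a rate and a protein concentration combine into the relative specific activities reported in Table III, the sketch below divides the absorbance rate by the protein concentration and expresses the result as a percentage of a reference enzyme. This is an assumption about the arithmetic, not a formula given in the text; the function names and the example numbers are hypothetical. Only the 0.4 absorbance units/min linearity limit comes from the assay description.

```python
LINEAR_LIMIT = 0.4  # absorbance units/min; upper end of the linear range stated in the text


def specific_rate(rate_per_min, protein_mg_per_ml):
    """Absorbance rate normalised to protein concentration (arbitrary units)."""
    if rate_per_min > LINEAR_LIMIT:
        raise ValueError("rate exceeds the linear range; dilute the sample and repeat")
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return rate_per_min / protein_mg_per_ml


def percent_of_reference(variant_rate, variant_protein, ref_rate, ref_protein):
    """Specific activity of a variant as a percentage of the reference enzyme."""
    variant = specific_rate(variant_rate, variant_protein)
    reference = specific_rate(ref_rate, ref_protein)
    return 100.0 * variant / reference


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(round(percent_of_reference(0.15, 1.0, 0.30, 1.0), 1))
```

Rejecting rates above the linear limit, rather than silently using them, mirrors the usual practice of diluting and re-assaying.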

Starch Hydrolysis Assay : 

The standard method for assaying the alpha-amylase activity of Spezyme® AA20 was used. This method is
described in detail in Example 1 of USSN 07/785,624, incorporated herein by reference. Native starch forms a blue color
with iodine but fails to do so when it is hydrolyzed into shorter dextrin molecules. The substrate is soluble Lintner starch
5gm/liter in phosphate buffer, pH 6.2 (42.5gm/liter potassium dihydrogen phosphate, 3.16gm/liter sodium hydroxide).
The sample is added in 25mM calcium chloride and activity is measured as the time taken to give a negative iodine
test upon incubation at 30°C. Activity is recorded in liquefons per gram or ml (LU) calculated according to the formula:

$$\text{LU/ml or LU/g} = \frac{570}{V \times t} \times D$$



EP 0 867 504 A1 






Where 



LU = liquefon unit 

V = volume of sample (5ml) 

t = dextrinization time (minutes) 

D = dilution factor = dilution volume/ml or g of added enzyme. 
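
The liquefon calculation can be sketched as a small function. This is illustrative only: the constant 570 is assumed here from the standard form of the liquefon formula and should be checked against the referenced method, and the function name and example values are hypothetical. V, t and D follow the definitions above.

```python
def liquefon_units(dextrinization_min, dilution_factor, sample_volume_ml=5.0, constant=570.0):
    """Return activity in LU per ml or g: constant / (V * t) * D.

    dextrinization_min -- t, time to a negative iodine test (minutes)
    dilution_factor    -- D, dilution volume per ml or g of added enzyme
    sample_volume_ml   -- V, volume of sample (5 ml in the text)
    constant           -- assumed value of the formula's numerator
    """
    if dextrinization_min <= 0 or sample_volume_ml <= 0:
        raise ValueError("time and volume must be positive")
    return constant / (sample_volume_ml * dextrinization_min) * dilution_factor


if __name__ == "__main__":
    # Hypothetical example: 10 minute dextrinization time, 100-fold dilution.
    print(liquefon_units(10.0, 100.0))
```

Activity is inversely proportional to the dextrinization time, so halving t at a fixed dilution doubles the reported LU.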

TABLE III

                     SPECIFIC ACTIVITY (as % of AA20 value) on:
ALPHA-AMYLASE        Soluble Substrate    Starch
Spezyme® AA20        100                  100
A4 form              105                  115
M15L (A4 form)       93                   94
M15L                 85                   103
M197T (A4 form)      75                   83
M197T                62                   81
M197A (A4 form)      88                   89
M197C                85                   85
M197L (A4 form)      51                   17

Example 4 

Characterization of Variant M15L 

Variant M15L made as per the prior examples did not show increased amylase activity (Table III) and was still 
inactivated by hydrogen peroxide (Fig. 7). It did, however, show significantly increased performance in jet-liquefaction 
of starch, especially at low pH as shown in Table IV below. 

Starch liquefaction was typically performed using a Hydroheater M 103-M steam jet equipped with a 2.5 liter delay 
coil behind the mixing chamber and a terminal back pressure valve. Starch was fed to the jet by a Moyno pump and 
steam was supplied by a 150 psi steam line, reduced to 90-100 psi. Temperature probes were installed just after the 
Hydroheater jet and just before the back pressure valve. 

Starch slurry was obtained from a corn wet miller and used within two days. The starch was diluted to the desired
solids level with deionized water and the pH of the starch was adjusted with 2% NaOH or saturated Na2CO3. Typical
liquefaction conditions were:



Starch           32%-35% solids
Calcium          40-50 ppm (30 ppm added)
pH               5.0-6.0
Alpha-amylase    12-14 LU/g starch dry basis



Starch was introduced into the jet at about 350 ml/min. The jet temperature was held at 105°-107°C. Samples of 
starch were transferred from the jet cooker to a 95°C second stage liquefaction and held for 90 minutes. 
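
For orientation, the nominal hold time in the delay coil follows from the coil volume and the feed rate. The sketch below is a rough estimate under stated assumptions: plug flow, a completely filled 2.5 liter coil, and no volume added by condensed steam. None of these is stated in the text, and the function name is hypothetical.

```python
def residence_time_min(coil_volume_ml, flow_ml_per_min):
    """Nominal residence time (minutes) = coil volume / volumetric flow rate."""
    if flow_ml_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return coil_volume_ml / flow_ml_per_min


if __name__ == "__main__":
    # 2.5 liter delay coil fed at about 350 ml/min, as described above.
    print(round(residence_time_min(2500.0, 350.0), 1), "min")
```

Under these assumptions the starch spends roughly seven minutes at jet temperature before the 90 minute second-stage hold.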

The degree of starch liquefaction was measured immediately after the second stage liquefaction by determining 
the dextrose equivalence (DE) of the sample and by testing for the presence of raw starch, both according to the 
methods described in the Standard Analytical Methods of the Member Companies of the Corn Refiners Association,
Inc., sixth edition. Starch, when treated generally under the conditions given above and at pH 6.0, will yield a liquefied
starch with a DE of about 10 and with no raw starch. Results of starch liquefaction tests using mutants of the present 
invention are provided in Table IV. 
TABLE IV



Performance of Variants M15L (A4 form) and M15L in Starch Liquefaction

                     pH      DE after 90 Mins.
Spezyme® AA20        5.9     9.9
M15L (A4 form)       5.9     10.4
Spezyme® AA20        5.2     1.2
M15L (A4 form)       5.2     2.2
Spezyme® AA20        5.9     9.3*
M15L                 5.9     11.3*
Spezyme® AA20        5.5     3.25**
M15L                 5.5     6.7**
Spezyme® AA20        5.2     0.7**
M15L                 5.2     3.65**

* average of three experiments
** average of two experiments



Example 5 

Construction of M15X Variants 

Following generally the processes described in Example 3 above, all variants at M15 (M15X) were produced in
native B. licheniformis by cassette mutagenesis, as outlined in Fig. 8.

1) Site directed mutagenesis (via primer extension in M13) was used to introduce unique restriction sites flanking
the M15 codon to facilitate insertion of a mutagenesis cassette. Specifically, a BstBI site at codons 11-13 and a
MscI site at codons 18-20 were introduced using the two oligonucleotides shown below.



M15XBstBI 5'-G ATG CAG TAT TTC GAA CTGG TAT A-3'    Seq ID No 48
(BstBI)

M15XMscI 5'-TG CCC AAT GAT GGC CAA CAT TGG AAG-3'    Seq ID No 49
(MscI)

2) The vector for M15X cassette mutagenesis was then constructed by subcloning the SfiI-SstII fragment from
the mutagenized amylase (BstBI+, MscI+) into plasmid pBLapr. The resulting plasmid was then digested with
BstBI and MscI and the large vector fragment isolated by electroelution from a polyacrylamide gel.

3) Mutagenesis cassettes were created as with the M197X variants. Synthetic oligomers, each encoding a
substitution at codon 15, were annealed to a common bottom primer. Upon proper ligation of the cassette to the vector,
the MscI site is destroyed, allowing for screening of positive transformants by loss of this site. The bottom primer
contains a unique SnaBI site, allowing the transformants derived from the bottom strand to be eliminated by
screening for the SnaBI site. This primer also contains a frameshift which would also eliminate amylase expression
for the mutants derived from the common bottom strand.

The synthetic cassettes are listed in Table V and the general cassette mutagenesis strategy is illustrated in Figure 8. 
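
The two screens described in step 3 can be illustrated with a short script. This is a sketch under stated assumptions: the cassette sequences are as best read from Table V (the printed table is difficult to read, so treat them as assumptions), the three vector bases after the MscI blunt cut are inferred from the M15XMscI oligonucleotide, and all names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOP_M15A = "CGAATGGTATGCTCCCAATGACGG"       # top strand, Ala codon (GCT) at position 15
BOTTOM_COMMON = "CCGTCATTGGGACTACGTACCATT"  # common bottom strand (M15X)
VECTOR_AFTER_MSCI = "CCA"                   # assumed vector bases after the MscI blunt cut
MSCI_SITE = "TGGCCA"
SNABI_SITE = "TACGTA"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def msci_destroyed(top_strand):
    """True if the cassette/vector junction no longer contains the MscI site."""
    return MSCI_SITE not in top_strand + VECTOR_AFTER_MSCI


def bottom_strand_is_frameshifted(top_strand, bottom_strand):
    """True if a clone copied from the bottom strand would differ in length
    from the top-strand insert by a number of bases not divisible by three."""
    paired_top = top_strand[2:]  # drop the 2-base BstBI overhang
    derived = reverse_complement(bottom_strand)
    return (len(derived) - len(paired_top)) % 3 != 0


if __name__ == "__main__":
    print("MscI destroyed:", msci_destroyed(TOP_M15A))
    print("SnaBI in bottom strand:", SNABI_SITE in BOTTOM_COMMON)
    print("Bottom strand frameshift:", bottom_strand_is_frameshifted(TOP_M15A, BOTTOM_COMMON))
```

A clone that has lost MscI and lacks SnaBI is therefore expected to derive from the variant-encoding top strand.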
TABLE V

Synthetic Oligonucleotides Used for Cassette Mutagenesis
to Produce M15X Variants

M15A (BstBI) C GAA TGG TAT GCT CCC AAT GAC GG (MscI) Seq ID No 50
M15R (BstBI) C GAA TGG TAT CGC CCC AAT GAC GG (MscI) Seq ID No 51
M15N (BstBI) C GAA TGG TAT AAT CCC AAT GAC GG (MscI) Seq ID No 52
M15D (BstBI) C GAA TGG TAT GAT CCC AAT GAC GG (MscI) Seq ID No 53
M15H (BstBI) C GAA TGG TAT CAC CCC AAT GAC GG (MscI) Seq ID No 54
M15K (BstBI) C GAA TGG TAT AAA CCC AAT GAC GG (MscI) Seq ID No 55
M15P (BstBI) C GAA TGG TAT CCG CCC AAT GAC GG (MscI) Seq ID No 56
M15S (BstBI) C GAA TGG TAT TCT CCC AAT GAC GG (MscI) Seq ID No 57
M15T (BstBI) C GAA TGG TAC ACT CCC AAT GAC GG (MscI) Seq ID No 58
M15V (BstBI) C GAA TGG TAT GTT CCC AAT GAC GG (MscI) Seq ID No 59
M15C (BstBI) C GAA TGG TAT TGT CCC AAT GAC GG (MscI) Seq ID No 60
M15Q (BstBI) C GAA TGG TAT CAA CCC AAT GAC GG (MscI) Seq ID No 61
M15E (BstBI) C GAA TGG TAT GAA CCC AAT GAC GG (MscI) Seq ID No 62
M15G (BstBI) C GAA TGG TAT GGT CCC AAT GAC GG (MscI) Seq ID No 63
M15I (BstBI) C GAA TGG TAT ATT CCC AAT GAC GG (MscI) Seq ID No 64
M15F (BstBI) C GAA TGG TAT TTT CCC AAT GAC GG (MscI) Seq ID No 65
M15W (BstBI) C GAA TGG TAC TGG CCC AAT GAC GG (MscI) Seq ID No 66
M15Y (BstBI) C GAA TGG TAT TAT CCC AAT GAC GG (MscI) Seq ID No 67
M15X (MscI) CC GTC ATT GGG ACT ACG TAC CAT T (BstBI) Seq ID No 68
(bottom strand)

Underline indicates codon changes at amino acid position 15.

Conservative substitutions were made in some cases to prevent introduction
of new restriction sites.

Example 6 

Bench Liquefaction with M15X Variants 

Eleven alpha-amylase variants with substitutions for M15 made as per Example 5 were assayed for activity, as 
compared to Spezyme® AA20 (commercially available from Genencor International, Inc.) in liquefaction at pH 5.5 
using a bench liquefaction system. The bench scale liquefaction system consisted of a stainless steel coil (0.25 inch 
diameter, approximately 350 ml volume) equipped with a 7 inch long static mixing element approximately 12 inches 
from the anterior end and a 30 psi back pressure valve at the posterior end. The coil, except for each end, was immersed 
in a glycerol-water bath equipped with thermostatically controlled heating elements that maintained the bath at 
105-106°C. 

Starch slurry containing enzyme, maintained in suspension by stirring, was introduced into the reaction coil by a
piston driven metering pump at about 70 ml/min. The starch was recovered from the end of the coil and was transferred
to the secondary hold (95°C for 90 minutes). Immediately after the secondary hold, the DE of the liquefied starch was
determined, as described in Example 4. The results are shown in Fig. 12.

Example 7 

Characterization of M15X Variants

All M15X variants were propagated in Bacillus subtilis and the expression level monitored as shown in Fig. 9. The
amylase was isolated and partially purified by a 20-70% ammonium sulfate cut. The specific activity of these variants
on the soluble substrate was determined as per Example 3 (Fig. 10). Many of the M15X amylases have specific activities
greater than that of Spezyme® AA20. A benchtop heat stability assay was performed on the variants by heating the
amylase at 90°C for 5 min. in 50 mM acetate buffer pH 5 in the presence of 5 mM CaCl2 (Fig. 11). Most of the variants
performed as well as Spezyme® AA20 in this assay. Those variants that exhibited reasonable stability in this assay
(reasonable stability defined as those that retained at least about 60% of Spezyme® AA20's heat stability) were tested
for specific activity on starch and for liquefaction performance at pH 5.5. The most interesting of those mutants are
shown in Fig. 16. M15D, N and T, along with L, outperformed Spezyme® AA20 in liquefaction at pH 5.5 and have
increased specific activities in both the soluble substrate and starch hydrolysis assays.
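
The stability cut-off used to select variants for further testing amounts to a simple comparison against the reference enzyme. The sketch below assumes that "heat stability" is expressed as the percentage of activity remaining after the 90°C treatment; that reading, the function name and the example values are assumptions for illustration, not data from the figures.

```python
def passes_stability_screen(variant_residual, reference_residual, threshold=0.60):
    """True if the variant retains at least `threshold` of the reference's
    residual activity after the heat treatment."""
    if reference_residual <= 0:
        raise ValueError("reference residual activity must be positive")
    return variant_residual >= threshold * reference_residual


if __name__ == "__main__":
    # Hypothetical residual activities (% remaining after 5 min at 90 C).
    reference = 70.0
    for name, residual in {"variant_1": 45.0, "variant_2": 40.0}.items():
        print(name, passes_stability_screen(residual, reference))
```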

Generally, we have found that by substituting for the methionine at position 15, we can provide variants with
increased low pH-liquefaction performance and/or increased specific activity.

Example 8 

Tryptophan Sensitivity to Oxidation 

Chloramine-T (sodium N-chloro-p-toluenesulfonimide) is a selective oxidant, which oxidizes methionine to methionine
sulfoxide at neutral or alkaline pH. At acidic pH, chloramine-T will modify both methionine and tryptophan (Schechter,
Y., Burstein, Y. and Patchornik, A. (1975) Biochemistry 14(20) 4497-4503). Fig. 13 shows the inactivation of B.
licheniformis alpha-amylase with chloramine-T at pH 8.0 (AA20 = 0.65 mg/ml, M197A = 1.7 mg/ml, M197L = 1.7 mg/ml).
The data show that by changing the methionine at position 197 to leucine or alanine, the inactivation of alpha-amylase
can be prevented. Conversely, as shown in Fig. 14, at pH 4.0 inactivation of M197A and M197L proceeds,
but requires more equivalents of chloramine-T (Fig. 18; AA20 = 0.22 mg/ml, M197A = 4.3 mg/ml, M197L = 0.53 mg/ml;
200 mM NaAcetate at pH 4.0). This suggests that a tryptophan residue is also implicated in the chloramine-T mediated
inactivation event. Furthermore, tryptic mapping and subsequent amino acid sequencing indicated that the tryptophan
at position 138 was oxidized by chloramine-T (data not shown). To prove this, site-directed mutants were made at
tryptophan 138 as provided below:

Preparation of Alpha-Amylase Double Mutants W138 and M197 

Certain variants of W138 (F, Y and A) were made as double mutants, with M197T (made as per the disclosure of
Example 3). The double mutants were made following the methods described in Examples 1 and 3. Generally, single
negative strands of DNA were prepared from an M13MP18 clone of the 1.72kb coding sequence (PstI-SstI) of the B.
licheniformis alpha-amylase M197T mutant. Site-directed mutagenesis was done using the primers listed below,
essentially by the method of Zoller, M. et al. (1983) except T4 gene 32 protein and T4 polymerase were substituted for
Klenow. The primers all contained unique sites, as well as the desired mutation, in order to identify those clones with
the appropriate mutation.



Tryptophan 138 to Phenylalanine

133 134 135 136 137 138 139 140 141 142 143
CAC CTA ATT AAA GCT TTC ACA CAT TTT CAT TTT    Seq ID No 42
            Hind III

Tryptophan 138 to Tyrosine

133 134 135 136 137 138 139 140 141 142 143
CAC CTA ATT AAA GCT TAC ACA CAT TTT CAT TTT    Seq ID No 43
            Hind III



Tryptophan 138 to Alanine - This primer also engineers unique sites
upstream and downstream of the 138 position.

  127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142
C CGC GTA ATT TCC GGA GAA CAC CTA ATT AAA GCC GCA ACA CAT TTT CAT
              BspE I

143 144 145 146 147
TTT CCC GGG CGC GGC AG    Seq ID No 44
    Xma I

Mutants were identified by restriction analysis and W138F and W138Y confirmed by DNA sequencing. The W138A
sequence revealed a nucleotide deletion between the unique BspE I and Xma I sites; however, the rest of the gene
sequenced correctly. The 1.37kb SstII/SstI fragment containing both W138X and M197T mutations was moved from
M13MP18 into the expression vector pBLapr resulting in pBLapr (W138F, M197T) and pBLapr (W138Y, M197T). The
fragment containing unique BspE I and Xma I sites was cloned into pBLapr (BspE I, Xma I, M197T) since it is useful
for cloning cassettes containing other amino acid substitutions at position 138.
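
A quick way to check that a mutagenic primer encodes the intended residue and carries its diagnostic site is to translate it in frame and search for the recognition sequence. The sketch below does this for the W138F and W138Y primers as best read from the listing above (covering codons 133-143; the primer sequences are therefore assumptions), using the standard genetic code. Function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

HINDIII_SITE = "AAGCTT"
FIRST_CODON = 133  # codon number of the first triplet in each primer

PRIMERS = {
    "W138F": "CACCTAATTAAAGCTTTCACACATTTTCATTTT",
    "W138Y": "CACCTAATTAAAGCTTACACACATTTTCATTTT",
}


def translate(seq):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def residue_at(seq, position, first_codon=FIRST_CODON):
    """Return the amino acid encoded at the given codon number."""
    return translate(seq)[position - first_codon]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, translate(primer),
              "residue 138:", residue_at(primer, 138),
              "HindIII site present:", HINDIII_SITE in primer)
```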

Single Mutations at Amino Acid Position 138 

Following the general methods described in the prior examples, certain single variants of W138 (F, Y, L, H and C)
were made.

The 1.24kb Asp718-SstI fragment containing the M197T mutation in plasmid pBLapr (W138X, M197T) of Example
7 was replaced by the wild-type fragment with methionine at 197, resulting in pBLapr (W138F), pBLapr (W138Y) and
pBLapr (BspE I, Xma I).

The mutants W138L, W138H and W138C were made by ligating synthetic cassettes into the pBLapr (BspE I, Xma
I) vector using the following primers:
Tryptophan 138 to Leucine

CC GGA GAA CAC CTA ATT AAA GCC CTA ACA CAT TTT CAT TTT C 

Seq ID No 45 

Tryptophan 138 to Histidine 

CC GGA GAA CAC CTA ATT AAA GCC CAC ACA CAT TTT CAT TTT C 

Seq ID No 46

Tryptophan 138 to Cysteine

CC GGA GAA CAC CTA ATT AAA GCC TGC ACA CAT TTT CAT TTT C

Seq ID No 47
Reaction of the double mutants M197T/W138F and M197T/W138Y with chloramine-T was compared with wild-type
(AA20 = 0.75 mg/ml, M197T/W138F = 0.64 mg/ml, M197T/W138Y = 0.60 mg/ml; 50 mM NaAcetate at pH 5.0).
The results in Fig. 19 show that mutagenesis of tryptophan 138 has caused the variant to be more resistant to
chloramine-T.

Annex to the description 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: GENENCOR INTERNATIONAL, INC.

(ii) TITLE OF INVENTION: Mutant Alpha-Amylase

(iii) NUMBER OF SEQUENCES: [illegible]

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Genencor International, Inc.
(B) STREET: [illegible] Place, 1870 Winton [illegible]
(C) CITY: Rochester
(D) STATE: NY
(E) COUNTRY: USA
(F) ZIP: 14618

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: [illegible]
(B) REGISTRATION NUMBER: [illegible]
(C) REFERENCE/DOCKET NUMBER: 44411/400

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 44 171 404 5921
(B) TELEFAX: 44 171 [illegible]
(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GATCAAAACA TAAAAAACCG GCCTTGGCCC CCCCGGTTTT TTATTATTTT TGAGCT 56
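
Entries in a listing of this kind can be sanity-checked by comparing the declared LENGTH with the number of bases actually printed and by flagging any character that is not a nucleotide. The following is a minimal sketch with hypothetical function names; it assumes sequence lines are printed as space-separated blocks of bases, optionally followed by a running base count.

```python
VALID_BASES = set("ACGT")


def parse_listing_line(line):
    """Split a listing line into (sequence, trailing_count or None)."""
    tokens = line.split()
    trailing_count = None
    if tokens and tokens[-1].isdigit():
        trailing_count = int(tokens.pop())
    return "".join(tokens).upper(), trailing_count


def check_entry(line, declared_length):
    """Report whether the printed sequence matches its declared length and
    list any characters that are not A, C, G or T."""
    sequence, trailing_count = parse_listing_line(line)
    return {
        "length_ok": len(sequence) == declared_length,
        "count_ok": trailing_count in (None, len(sequence)),
        "invalid": sorted(set(sequence) - VALID_BASES),
    }


if __name__ == "__main__":
    seq_id_1 = "GATCAAAACA TAAAAAACCG GCCTTGGCCC CCCCGGTTTT TTATTATTTT TGAGCT 56"
    print(check_entry(seq_id_1, 56))
```

A digit or punctuation mark reported under `invalid` usually points to a scanning error rather than a genuine sequence feature.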

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TGGGACGCTC GCGCAGTACT TTGAATGGT 29

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TGATGCACTA CTTTCAATGG TACCTCCCCA ATGA 34

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GATTATTTGT TGTATGCCGA TATCGACTAT GACCAT 36

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CGGGGAAGGA GCCCTTTACC TAGCT 25

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCCCCTATGA CTTAACGAAA TTGC 24

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CTACGCGGAT GCATACGGGA CGA 23

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CTACGGCCAT TACTACCGGA CCAACCCAGA CTCCC 35

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CCGGTGGGGC CAACCCGGCC TATGTTGGCC GGCAAA 36

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CATCAGCGTC CCATTAAGAT TTGCAGCCTG CGCAGACATG TTGCT 45

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GATTATTTGG CGTATGCCGA TATCGACTAT GACCAT 36

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GGGAAGTTTC GAATGAAAAC G 21

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GTCGGCATAT GCATATAATC ATAGTTGCCG TTTTCATT 38

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CGAATGAAAA CGGCAACTAT GATTATTTGA TCTATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CGAATGAAAA CGGCAACTAT GATTATTTGT TCTATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CGAATGAAAA CGGCAACTAT GATTATTTGG TTTATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CGAATGAAAA CGGCAACTAT GATTATTTGA GCTATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CGAATGAAAA CGGCAACTAT GATTATTTGC CTTATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CGAATGAAAA CGGCAACTAT GATTATTTGA CATATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CGAATGAAAA CGGCAACTAT GATTATTTGT ACTATGCCGA C 41

(2) INFORMATION FOR SH; Z0 NO : 2 1 : 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH: - 1 base pairs 
(3) TYPE: nucleic acid 

(C) STRAN3E-NESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA (genomic) 



(xij SEQUENCE DESCRIPTION: SEQ ID NO:21: 

CGAATGAAAA CCCCAACTA7 CATTATTTGC ACTATGCCGA C 

(2) INFORMATION FOR SEQ IU NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 41 base pairs 
O) TYPE: nucleic acid 

(C) STRANO^N'ESS: single 

(D) TOPOLOGY: linear 

lii) MOLECULE TYPE: SNA (genomic) 



{xi) SEQUENCE DESCRIPTION; SEQ 10 NO;22: 

CGAATGAAAA CGGCAACTAT GATTATTTGG GCTATGCCGA C 

(2) INFORMATION FOR SZQ ID NO: 23: 

(LI SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 1 base pairs 
(3) TYPE: -ucleic acid 

(C) STRANIESSESS: singLe 

(D) TOPOLCCY : linear 
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(ii) MOLECULE TYPE : DNA (genomic) 



(xi) SEQUENCE! DESCRIPTION : SEQ ID NO:23: 

CGAATGAAAA CCGCAACTAT CATTATTTCC AATATGCCGA C 

(2) INFORMATION FOR SEQ 10 NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 41 base pairs 
(9) TYPE: nucleic acid 
{ C) STRANDEDNSSS: single 
(D) TOPOLCCY: linear 

(ii) MOLECULE TYPE : DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

CGAATGAAAA CCCCAACTAT GATTATTTGA ACTATGCCCA C 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

GCAATGAAAA CCCCAACTAT GATTATTTGA AATATGCCGA C

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

CGAATGAAAA CGGCAACTAT GATTATTTGG ATTATGCCGA C

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CGAATGAAAA CCGCAACTAT GATTATTTGG AATATGCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

CGAATGAAAA CCGCAACTAT GATTATTTCT GTATTCCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CGAATGAAAA CCGCAACTAT GATTATTTCT GCTATCCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

CGAATGAAAA CCGCAACTAT GATTATTTGA GATATCCCGA C 41

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1968 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

ACCTTGAAGA AG7C A.-.3r_-.3 CAGAGAGGCT ATTGAATAAA TGAGTAGAAA GCGCCATATC 60

EP 0 867 504 A1 

CCCGCTTTTC TTTTGCAAGA AAATATAGCG AAAATCGTAC TTGTTAAAAA TTCGCAATAT 120
TTATACAACA TCATATGTTT CACATTGAAA GGGGAGGAGA ATCATGAAAC AACAAAAACG 180
CCTTTACCCC CGATTCCTGA CGCTCTTATT TGCCCTCATC TTCXTGCTGC CTCATTCTCC 240
ACCACCCGCG CCAAATCTTA ATGGGACCCT GATGCAGTAT TTTGAATGGT ACATGCCCAA 300
TCACGGCCAA CATTCGAACC GTTTCCAAAA CGACTCGCCA TATTTGGCTG AACACGGTAT 360
TACTGCCGTC TGGATTCCC: CGGCATATAA GGGAACGAGG CAAGCGGATG TGGCCTACGG 420
TCCTTACGAC CTTTATGATT TACCGGACTT TCATCAAAAA GGGACGGTTC GCACAAAGTA 480
CGGCACAAAA GCACACCTC: AATCTGCGAT CAAAAGTCTT CATTCCCGCG ACATTAACGT 540
TTACGGGGAT GTCCTCATCA ACCACAAAGG CCGCGCTGAT GCGACCGAAG ATGTAACCGC 600
GGTTGAACTC CATCCCGCTC ACCGCAACCG CCTAATTTCA GGAGAACACC TAATTAAAGC 660
CTGGACACAT TTXCATTTTC CGGGGCCCCG CAGCACATAC ACCCATTTTA AATCCCATTG 720
GTACCATTTT CACCGAACCC ATTCGCACCA GTCCCGAAAG CTCAACCCCA TCTATAAGTT 780
TCAAGGAAAG CCTTGGGATT GGGAAGTTTC CAATGAAAAC GGCAACTATG ATTATTTCAT 840
GTATGCCCAC ATCGATTATG ACCATCCTGA TGTCGCAGCA GAAATTAAGA GATGGGGCAC 900
TTGGTATGCC AATGAACTCZ AATTGCACGC TTTCCCTCTT CATCCTCTCA AACACATTAA 960
ATTTTCTTTT TTGCGGCAT? GGGTTAATCA TGTCAGGGAA AAAACGGCGA AGCAAATCTT 1020
TACGGTAGCT GAATATTCGT ACAATGACTT GGCCGCGCTG GAAAACTATT TGAACAAAAC 1080
AAATTTTAAT CATTCAGTGT TTGACGTGCC CCTTCATTAT CACTTCCATC CTCCATCCAC 1140
ACAGGGAGGC CGCTATGATA TGAGGAAATT GCTGAACGCT ACCGTCGTTT CCAAGCATCC 1200
CTTGAAATCG GTTACATTTC TCGATAACCA TGATACACAG CCGCGGCAAT CCCTTCACTC 1260
GACTGTCCAA ACATGGTTTA AGCCGCTTCC TTACGCTTTT ATTCTCACAA GGGAATCTCG 1320
ATACCCTCAG GTTTTCTACC GGGATATGTA CGGGACGAAA GGAGACTCCC AGCGCGAAAT 1380
TCCTGCCTTG AAACACAAAA TTGAACCCAT CTTAAAAGCG AGAAAACAGT ATGCGTACGG 1440
AGCACAGCAT GATTATTTCC ACCACCATGA CATTGTCGGC TGGACAAGGG AAGGCGACAG 1500
CTCGGTTGCA AATTCAGGTT TGGCGGCATT AATAACAGAC CGACCCCGTG GGGCAAAGCG 1560
AATGTATGTC CCCCCCCAAA ACGCCGGTGA CACATCGCAT CACATTACCG GAAACCGTTC 1620
GGACCCGGTT CTCATCAATT CGCAAGGCTC GGGAGAGTTT CACGTAAACG GCGGGTCGGT 1680
TTCAATTTAT GTTCAAAGAT AGAACACCAC AGAGCACGGA TTTCCTGAAG GAAATCCCTT 1740
TTTTTATTTT CCCCGTCTTA TAAATTTCTT TGATTACATT TTATAATTAA TTTTAACAAA 1800
GTGTCATCAC CCCTCACCAA CGACTTCCTG ACAGTTTCAA TCCCATAGGT AACGCCGGGA 1860
TGAAATCGCA ACCTTATCTC r. . C - " wCAAA GAAAGCAAAT GTCTCGAAAA TGACGGTATC 1920
CCGGGTGATC AATCATCCTC A3ACTGTCAC GGATGAATTG AAAAAGCT 1968

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Ala Asn Leu Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro
1 5 10 15

Asn Asp Gly Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu
20 25 30

Ala Glu His Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly
35 40 45

Thr Ser Gln Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu
50 55 60

Gly Glu Phe His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys
65 70 75 80

Gly Glu Leu Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn
85 90 95

Val Tyr Gly Asp Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr
100 105 110

Glu Asp Val Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val
115 120 125

Ile Ser Gly Glu His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro
130 135 140

Gly Arg Gly Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe
145 150 155 160

Asp Gly Thr Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys
165 170 175

Phe Gln Gly Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn
180 185 190

Tyr Asp Tyr Leu He' Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val
195 200 205

Ala Ala Glu Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln
210 215 220

Leu Asp Gly Phe Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe
225 230 235 240

Leu Arg Asp Trp Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met
245 250 255

Phe Thr Val Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn
260 265 270

Tyr Leu Asn Lys Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu
275 280 285

His Tyr Gln Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met
290 295 300

Arg Lys Leu Leu Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser
305 310 315 320

Val Thr Phe Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu
325 330 335

Ser Thr Val Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu
340 345 350

Thr Arg Glu Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly
355 360 365

Thr Lys Gly Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile
370 375 380

Glu Pro Ile Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His
385 390 395 400

Asp Tyr Phe Asp His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp
405 410 415

Ser Ser Val Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro
420 425 430

Gly Gly Ala Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr
435 440 445

Trp His Asp Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser
450 455 460

Glu Gly Trp Gly Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr
465 470 475 480

Val Gln Arg

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Met Lys Gln Gln Lys Arg Leu Tyr Ala Arg Leu Leu Thr Leu Leu Phe
1 5 10 15

Ala Leu Ile Phe Leu Leu Pro His Ser Ala Ala Ala Ala Ala Asn Leu
20 25 30

Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro Asn Asp Gly
35 40 45

His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu Ala Glu His Gly
50 55 60

Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly Thr Ser Gln Ala
65 70 75 80

Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu Gly Glu Phe His
85 90 95

Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys Gly Glu Leu Gln
100 105 110

Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn Val Tyr Gly Asp
115 120 125

Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr Glu Asp Val Thr
130 135 140

Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val Ile Ser Gly Glu
145 150 155 160

His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro Gly Arg Gly Ser
165 170 175

Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe Asp Gly Thr Asp
180 185 190

Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys Phe Gln Gly Lys
195 200 205

Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn Tyr Asp Tyr Leu
210 215 220

Met Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val Ala Ala Glu Ile
225 230 235 240

Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln Leu Asp Gly Phe
245 250 255

Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe Leu Arg Asp Trp
260 265 270

Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met Phe Thr Val Ala
275 280 285

Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn Tyr Leu Asn Lys
290 295 300

Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu His Tyr Gln Phe
305 310 315 320

His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met Arg Lys Leu Leu
325 330 335

Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser Val Thr Phe Val
340 345 350

Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu Ser Thr Val Gln
355 360 365

Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu Thr Arg Glu Ser
370 375 380

Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly Thr Lys Gly Asp
385 390 395 400

Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile Glu Pro Ile Leu
405 410 415

Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His Asp Tyr Phe Asp
420 425 430

His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp Ser Ser Val Ala
435 440 445

Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro Gly Gly Ala Lys
450 455 460

Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr Trp His Asp Ile
465 470 475 480

Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser Glu Gly Trp Gly
485 490 495

Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr Val Gln Arg
500 505 510

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 520 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

Met Arg Gly Arg Gly Asn Met Ile Gln Lys Arg Lys Arg Thr Val Ser
1 5 10 15

Phe Arg Leu Val Leu Met Cys Thr Leu Leu Phe Val Ser Leu Pro Ile
20 25 30

Thr Lys Thr Ser Ala Val Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp
35 40 45

Tyr Thr Pro Asn Asp Gly Gln His Trp Lys Arg Leu Gln Asn Asp Ala
50 55 60

Glu His Leu Ser Asp Ile Gly Ile Thr Ala Val Trp Ile Pro Pro Ala
65 70 75 80

Tyr Lys Gly Leu Ser Gln Ser Asp Asn Gly Tyr Gly Pro Tyr Asp Leu
85 90 95

Tyr Asp Leu Gly Glu Phe Gln Gln Lys Gly Thr Val Arg Thr Lys Tyr
100 105 110

Gly Thr Lys Ser Glu Leu Gln Asp Ala Ile Gly Ser Leu His Ser Arg
115 120 125

Asn Val Gln Val Tyr Gly Asp Val Val Leu Asn His Lys Ala Gly Ala
130 135 140

Asp Ala Thr Glu Asp Val Thr Ala Val Glu Val Asn Pro Ala Asn Arg
145 150 155 160

Asn Gln Glu Thr Ser Glu Glu Tyr Gln Ile Lys Ala Trp Thr Asp Phe
165 170 175

Arg Phe Pro Gly Arg Gly Asn Thr Tyr Ser Asp Phe Lys Trp His Trp
180 185 190

Tyr His Phe Asp Gly Ala Asp Trp Asp Glu Ser Arg Lys Ile Ser Arg
195 200 205

Ile Phe Lys Phe Arg Gly Glu Gly Lys Ala Trp Asp Trp Glu Val Ser
210 215 220

Ser Glu Asn Gly Asn Tyr Asp Tyr Leu Met Tyr Ala Asp Val Asp Tyr
225 230 235 240

Asp His Pro Asp Val Val Ala Glu Thr Lys Lys Trp Gly Ile Trp Tyr
245 250 255

Ala Asn Glu Leu Ser Leu Asp Gly Phe Arg Ile Asp Ala Ala Lys His
260 265 270

Ile Lys Phe Ser Phe Leu Arg Asp Trp Val Gln Ala Val Arg Gln Ala
275 280 285

Thr Gly Lys Glu Met Phe Thr Val Ala Glu Tyr Trp Gln Asn Asn Ala
290 295 300

Gly Lys Leu Glu Asn Tyr Leu Asn Lys Thr Ser Phe Asn Gln Ser Val
305 310 315 320

Phe Asp Val Pro Leu His Phe Asn Leu Gln Ala Ala Ser Ser Gln Gly
325 330 335

Gly Gly Tyr Asp Met Arg Arg Leu Leu Asp Gly Thr Val Val Ser Arg
340 345 350

His Pro Glu Lys Ala Val Thr Phe Val Glu Asn His Asp Thr Gln Pro
355 360 365

Gly Gln Ser Leu Glu Ser Thr Val Gln Thr Trp Phe Lys Pro Leu Ala
370 375 380

Tyr Ala Phe Ile Leu Thr Arg Glu Ser Gly Tyr Pro Gln Val Phe Tyr
385 390 395 400

Gly Asp Met Tyr Gly Thr Lys Gly Thr Ser Pro Lys Glu Ile Pro Ser
405 410 415

Leu Lys Asp Asn Ile Glu Pro Ile Leu Lys Ala Arg Lys Glu Tyr Ala
420 425 430

Tyr Gly Pro Gln His Asp Tyr Ile Asp His Pro Asp Val Ile Gly Trp
435 440 445

Thr Arg Glu Gly Asp Ser Ser Ala Ala Lys Ser Gly Leu Ala Ala Leu
450 455 460

Ile Thr Asn Gly Pro Gly Gly Ser Lys Arg Met Tyr Ala Gly Leu Lys
465 470 475 480

Asn Ala Gly Glu Thr Trp Tyr Asp Ile Thr Gly Asn Arg Ser Asp Thr
485 490 495

Val Lys Ile Gly Ser Asp Gly Trp Gly Glu Phe His Val Asn Asp Gly
500 505 510

Ser Val Ser Ile Tyr Val Gln Lys
515 520

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 548 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Val Leu Thr Phe His Arg Ile Ile Arg Lys Gly Trp Met Phe Leu Leu
1 5 10 15

Ala Phe Leu Leu Thr Ala Ser Leu Phe Cys Pro Thr Gly Arg His Ala
20 25 30

Lys Ala Ala Ala Pro Phe Asn Gly Thr Met Met Gln Tyr Phe Glu Trp
35 40 45

Tyr Leu Pro Asp Asp Gly Thr Leu Trp Thr Lys Val Ala Asn Glu Ala
50 55 60

Asn Asn Leu Ser Ser Leu Gly Ile Thr Ala Leu Ser Leu Pro Pro Ala
65 70 75 80

Tyr Lys Gly Thr Ser Arg Ser Asp Val Gly Tyr Gly Val Tyr Asp Leu
85 90 95

Tyr Asp Leu Gly Glu Phe Asn Gln Lys Gly Thr Val Arg Thr Lys Tyr
100 105 110

Gly Thr Lys Ala Gln Tyr Leu Gln Ala Ile Gln Ala Ala His Ala Ala
115 120 125

Gly Met Gln Val Tyr Ala Asp Val Val Phe Asp His Lys Gly Gly Ala
130 135 140

Asp Gly Thr Glu Trp Val Asp Ala Val Glu Val Asn Pro Ser Asp Arg
145 150 155 160

Asn Gln Glu Ile Ser Gly Thr Tyr Gln Ile Gln Ala Trp Thr Lys Phe
165 170 175

Asp Phe Pro Gly Arg Gly Asn Thr Tyr Ser Ser Phe Lys Trp Arg Trp
180 185 190

Tyr His Phe Asp Gly Val Asp Trp Asp Glu Ser Arg Lys Leu Ser Arg
195 200 205

Ile Tyr Lys Phe Arg Gly Ile Gly Lys Ala Trp Asp Trp Glu Val Asp
210 215 220

Thr Glu Asn Gly Asn Tyr Asp Tyr Leu Met Tyr Ala Asp Leu Asp Met
225 230 235 240

Asp His Pro Glu Val Val Thr Glu Leu Lys Asn Trp Gly Lys Trp Tyr
245 250 255

Val Asn Thr Thr Asn Ile Asp Gly Phe Arg Leu Asp Gly Leu Lys His
260 265 270

Ile Lys Phe Ser Phe Phe Pro Asp Trp Leu Ser Tyr Val Arg Ser Gln
275 280 285

Thr Gly Lys Pro Leu Phe Thr Val Gly Glu Tyr Trp Ser Tyr Asp Ile
290 295 300

Asn Lys Leu His Asn Tyr Ile Thr Lys Thr Asn Gly Thr Met Ser Leu
305 310 315 320

Phe Asp Ala Pro Leu His Asn Lys Phe Tyr Thr Ala Ser Lys Ser Gly
325 330 335

Gly Ala Phe Asp Met Arg Thr Leu Met Thr Asn Thr Leu Met Lys Asp
340 345 350

Gln Pro Thr Leu Ala Val Thr Phe Val Asp Asn His Asp Thr Asn Pro
355 360 365

Ala Lys Arg Cys Ser His Gly Arg Pro Trp Phe Lys Pro Leu Ala Tyr
370 375 380

Ala Phe Ile Leu Thr Arg Gln Glu Gly Tyr Pro Cys Val Phe Tyr Gly
385 390 395 400

Asp Tyr Tyr Gly Ile Pro Gln Tyr Asn Ile Pro Ser Leu Lys Ser Lys
405 410 415

Ile Asp Pro Leu Leu Ile Ala Arg Arg Asp Tyr Ala Tyr Gly Thr Gln
420 425 430

His Asp Tyr Leu Asp His Ser Asp Ile Ile Gly Trp Thr Arg Glu Gly
435 440 445

Val Thr Glu Lys Pro Gly Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly
450 455 460

Ala Gly Arg Ser Lys Trp Met Tyr Val Gly Lys Gln His Ala Gly Lys
465 470 475 480

Val Phe Tyr Asp Leu Thr Gly Asn Arg Ser Asp Thr Val Thr Ile Asn
485 490 495

Ser Asp Gly Trp Gly Glu Phe Lys Val Asn Gly Gly Ser Val Ser Val
500 505 510

Trp Val Pro Arg Lys Thr Thr Val Ser Thr Ile Ala Arg Pro Ile Thr
515 520 525

Thr Arg Pro Trp Thr Gly Glu Phe Val Arg Trp His Glu Pro Arg Leu
530 535 540

Val Ala Trp Pro
545

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 483 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Ala Asn Leu Asn Gly Thr Leu Met Gln Tyr Phe Glu Trp Tyr Met Pro
1 5 10 15

Asn Asp Gly Gln His Trp Lys Arg Leu Gln Asn Asp Ser Ala Tyr Leu
20 25 30

Ala Glu His Gly Ile Thr Ala Val Trp Ile Pro Pro Ala Tyr Lys Gly
35 40 45

Thr Ser Gln Ala Asp Val Gly Tyr Gly Ala Tyr Asp Leu Tyr Asp Leu
50 55 60

Gly Glu Phe His Gln Lys Gly Thr Val Arg Thr Lys Tyr Gly Thr Lys
65 70 75 80

Gly Glu Leu Gln Ser Ala Ile Lys Ser Leu His Ser Arg Asp Ile Asn
85 90 95

Val Tyr Gly Asp Val Val Ile Asn His Lys Gly Gly Ala Asp Ala Thr
100 105 110

Glu Asp Val Thr Ala Val Glu Val Asp Pro Ala Asp Arg Asn Arg Val
115 120 125

Ile Ser Gly Glu His Leu Ile Lys Ala Trp Thr His Phe His Phe Pro
130 135 140

Gly Arg Gly Ser Thr Tyr Ser Asp Phe Lys Trp His Trp Tyr His Phe
145 150 155 160

Asp Gly Thr Asp Trp Asp Glu Ser Arg Lys Leu Asn Arg Ile Tyr Lys
165 170 175

Phe Gln Gly Lys Ala Trp Asp Trp Glu Val Ser Asn Glu Asn Gly Asn
180 185 190

Tyr Asp Tyr Leu Thr Tyr Ala Asp Ile Asp Tyr Asp His Pro Asp Val
195 200 205

Ala Ala Glu Ile Lys Arg Trp Gly Thr Trp Tyr Ala Asn Glu Leu Gln
210 215 220

Leu Asp Gly Phe Arg Leu Asp Ala Val Lys His Ile Lys Phe Ser Phe
225 230 235 240

Leu Arg Asp Trp Val Asn His Val Arg Glu Lys Thr Gly Lys Glu Met
245 250 255

Phe Thr Val Ala Glu Tyr Trp Gln Asn Asp Leu Gly Ala Leu Glu Asn
260 265 270

Tyr Leu Asn Lys Thr Asn Phe Asn His Ser Val Phe Asp Val Pro Leu
275 280 285

His Tyr Gln Phe His Ala Ala Ser Thr Gln Gly Gly Gly Tyr Asp Met
290 295 300

Arg Lys Leu Leu Asn Gly Thr Val Val Ser Lys His Pro Leu Lys Ser
305 310 315 320

Val Thr Phe Val Asp Asn His Asp Thr Gln Pro Gly Gln Ser Leu Glu
325 330 335

Ser Thr Val Gln Thr Trp Phe Lys Pro Leu Ala Tyr Ala Phe Ile Leu
340 345 350

Thr Arg Glu Ser Gly Tyr Pro Gln Val Phe Tyr Gly Asp Met Tyr Gly
355 360 365

Thr Lys Gly Asp Ser Gln Arg Glu Ile Pro Ala Leu Lys His Lys Ile
370 375 380

Glu Pro Ile Leu Lys Ala Arg Lys Gln Tyr Ala Tyr Gly Ala Gln His
385 390 395 400

Asp Tyr Phe Asp His His Asp Ile Val Gly Trp Thr Arg Glu Gly Asp
405 410 415

Ser Ser Val Ala Asn Ser Gly Leu Ala Ala Leu Ile Thr Asp Gly Pro
420 425 430

Gly Gly Ala Lys Arg Met Tyr Val Gly Arg Gln Asn Ala Gly Glu Thr
435 440 445

Trp His Asp Ile Thr Gly Asn Arg Ser Glu Pro Val Val Ile Asn Ser
450 455 460

Glu Gly Trp Gly Glu Phe His Val Asn Gly Gly Ser Val Ser Ile Tyr
465 470 475 480

Val Gln Arg



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 487 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Ala Ala Ala Ala Ala Asn Leu Asn Gly Thr Leu Met Gln Tyr Phe Glu
1 5 10 15

Trp Tyr Met Pro Asn Asp Gly Gln His Trp Lys Arg Leu Gln Asn Asp
20 25 30

Ser Ala Tyr Leu Ala Glu His Gly Ile Thr Ala Val Trp Ile Pro Pro
35 40 45

Ala Tyr Lys Gly Thr Ser Gln Ala Asp Val Gly Tyr Gly Ala Tyr Asp
50 55 60

Leu Tyr Asp Leu Gly Glu Phe His Gln Lys Gly Thr Val Arg Thr Lys
65 70 75 80

Tyr Gly Thr Lys Gly Glu Leu Gln Ser Ala Ile Lys Ser Leu His Ser
85 90 95

Arg Asp Ile Asn Val Tyr Gly Asp Val Val Ile Asn His Lys Gly Gly
100 105 110

Ala Asp Ala Thr Glu Asp Val Thr Ala Val Glu Val Asp Pro Ala Asp
115 120 125

Arg Asn Arg Val Ile Ser Gly Glu His Leu Ile Lys Ala Trp Thr His
130 135 140

Phe His Phe Pro Gly Arg Gly Ser Thr Tyr Ser Asp Phe Lys Trp His
145 150 155 160

Trp Tyr His Phe Asp Gly Thr Asp Trp Asp Glu Ser Arg Lys Leu Asn
165 170 175

Arg Ile Tyr Lys Phe Gln Gly Lys Ala Trp Asp Trp Glu Val Ser Asn
180 185 190

Glu Asn Gly Asn Tyr Asp Tyr Leu Met Tyr Ala Asp Ile Asp Tyr Asp
195 200 205

His Pro Asp Val Ala Ala Glu Ile Lys Arg Trp Gly Thr Trp Tyr Ala
210 215 220

Asn Glu Leu Gln Leu Asp Gly Phe Arg Leu Asp Ala Val Lys His Ile
225 230 235 240

Lys Phe Ser Phe Leu Arg Asp Trp Val Asn His Val Arg Glu Lys Thr
245 250 255

Gly Lys Glu Met Phe Thr Val Ala Glu Tyr Trp Gln Asn Asp Leu Gly
260 265 270

Ala Leu Glu Asn Tyr Leu Asn Lys Thr Asn Phe Asn His Ser Val Phe
275 280 285

Asp Val Pro Leu His Tyr Gln Phe His Ala Ala Ser Thr Gln Gly Gly
290 295 300

Gly Tyr Asp Met Arg Lys Leu Leu Asn Gly Thr Val Val Ser Lys His
305 310 315 320

Pro Leu Lys Ser Val Thr Phe Val Asp Asn His Asp Thr Gln Pro Gly
325 330 335

Gln Ser Leu Glu Ser Thr Val Gln Thr Trp Phe Lys Pro Leu Ala Tyr
340 345 350

Ala Phe Ile Leu Thr Arg Glu Ser Gly Tyr Pro Gln Val Phe Tyr Gly
355 360 365

Asp Met Tyr Gly Thr Lys Gly Asp Ser Gln Arg Glu Ile Pro Ala Leu
370 375 380

Lys His Lys Ile Glu Pro Ile Leu Lys Ala Arg Lys Gln Tyr Ala Tyr
385 390 395 400

Gly Ala Gln His Asp Tyr Phe Asp His His Asp Ile Val Gly Trp Thr
405 410 415

Arg Glu Gly Asp Ser Ser Val Ala Asn Ser Gly Leu Ala Ala Leu Ile
420 425 430

Thr Asp Gly Pro Gly Gly Ala Lys Arg Met Tyr Val Gly Arg Gln Asn
435 440 445

Ala Gly Glu Thr Trp His Asp Ile Thr Gly Asn Arg Ser Glu Pro Val
450 455 460

Val Ile Asn Ser Glu Gly Trp Gly Glu Phe His Val Asn Gly Gly Ser
465 470 475 480

Val Ser Ile Tyr Val Gln Arg
485

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Met Lys Gln Gln Lys Arg Leu Thr Ala Arg Leu Leu Thr Leu Leu Phe
1 5 10 15

Ala Leu Ile Phe Leu Leu Pro His Ser Ala Ala Ala Ala Ala Asn Leu
20 25 30

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Met Arg Ser Lys Thr Leu Trp Ile Ser Leu Leu Phe Ala Leu Thr Leu
1 5 10 15

Ile Phe Thr Met Ala Phe Ser Asn Met Ser Ala Gln Ala Ala Gly Lys
20 25 30

Ser



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Met Arg Ser Lys Thr Leu Trp Ile Ser Leu Leu Phe Ala Leu Thr Leu
1 5 10 15

Ile Phe Thr Met Ala Phe Ser Asn Met Ser Ala Gln Ala Ala Ala Ala
20 25 30

Ala Ala Asn
35

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Met Arg Ser Lys Thr Leu Trp Ile Ser Leu Leu Phe Ala Leu Thr Leu
1 5 10 15

Ile Phe Thr Met Ala Phe Ser Asn Met Ser Ala Gln Ala Ala Asn Leu
20 25 30

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CACCTAATTA AACCTTTCAC ACATTTTCAT TTT

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CACCTAATTA AAGCTTACAC ACATTTTCAT TTT

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 65 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

CCGCGTAATT TCCGCACAAC ACCTAATTAA AGCCGCAACA CATTTTCATT TTCCCGGGCG
CGGCAG

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CCGGAGAACA CCTAATTAAA GCCCTAACAC ATTTTCATTT TC

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CCGGAGAACA CCTAATTAAA CCCCACACAC ATTTTCATTT TC

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CCGGAGAACA CCTAATTAAA GCCTCCACAC ATTTTCATTT TC

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GATGCACTAT TTCGAACTCC TATA

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

TGCCCAATGA TCCCCAACAT TGGAAG

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CGAATGGTAT GCTCCCAATC ACGG

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CGAATGGTAT CCCCCCAATG ACGG

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

CGAATGGTAT AATCCCAATG ACGG

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

CGAATGGTAT GATCCCAATG ACGG

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAATCGTAT CACCCCAAT3 ACGG

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

CGAATCGTAT AAACCCAATG ACGC 24

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

CGAATCGTAT CCGCCCAATC ACCC 24

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

CGAATGGTAT TCTCCCAATG ACGC 24

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CGAATGGTAC ACTCCCAATC ACCC 24

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CGAATGGTAT GTTCCCAATG ACCG

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

CGAATGCTAT TGTCCCAATC ACGG

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

CGAATGCTAT CAACCCAATG ACGC

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

CCAATGGTAT CAACCCAATG ACGG

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

CCAATGCTAT GCTCCCAATC ACCG

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

CGAATGGTAT ATTCCCAATG ACGG

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CGAATGGTAT TTTCCCAATG ACGG

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs 
( 3 ) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66 
CGAATGGTAC TGGCCCAATG ACGG 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CGAATCGTAT TATCCCAATG ACGC 24 

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
CCGTCATTGG GACTACGTAC CATT 24 
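
Each entry above declares a length of 24 base pairs and prints the bases in space-separated groups, in some cases followed by the length. As a minimal sketch only, and not part of the listing as filed, an entry can be checked against its declared length as follows. The helper names parse_listing_line and reverse_complement are hypothetical and were chosen for this illustration; the input string is copied from SEQ ID NO: 68 as printed.

```python
# Illustrative check of a sequence-listing entry (hypothetical helpers,
# not taken from the patent text).

def parse_listing_line(line):
    """Split a listing line into (bases, declared_length_or_None).

    Bases are printed in space-separated groups; a trailing integer,
    when present, is treated as the declared length.
    """
    tokens = line.split()
    declared = None
    if tokens and tokens[-1].isdigit():
        declared = int(tokens.pop())
    bases = "".join(tokens).upper()
    return bases, declared

# Watson-Crick complement table for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(bases):
    """Return the reverse complement of an unambiguous DNA string."""
    return bases.translate(COMPLEMENT)[::-1]

# SEQ ID NO: 68 exactly as printed in the listing above.
seq68, declared68 = parse_listing_line("CCGTCATTGG GACTACGTAC CATT 24")

print(len(seq68), declared68)
print(reverse_complement(seq68))
```

A check of this kind confirms only the length and base alphabet of an entry; it cannot detect a substituted base introduced during transcription.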



Claims

1. A mutant alpha-amylase that is the expression product of a mutated DNA sequence encoding an alpha-amylase,
the mutated DNA sequence being derived from a precursor alpha-amylase which is a Bacillus alpha-amylase by
substitution or deletion of an amino acid at the position equivalent to M+15 in B. licheniformis alpha-amylase, with
the proviso that the substituent amino acid is not Leu, Ile, Asn, Ser, Gln, Asp or Glu.

2. A mutant alpha-amylase of claim 1 further comprising one or more other site specific mutations. 

3. A mutant alpha-amylase of any preceding claim wherein the precursor is from a Bacillus selected from the group
B. licheniformis, B. stearothermophilus and B. amyloliquefaciens.

4. A mutant alpha-amylase of claim 3 wherein the precursor is Bacillus licheniformis alpha-amylase. 


5. DNA encoding a mutant alpha-amylase of any one of claims 1 to 4. 

6. Expression vectors encoding the DNA of claim 5. 

7. Host cells transformed with the expression vector of claim 6.

8. A detergent composition comprising a mutant alpha-amylase of any one of claims 1 to 4.

9. A detergent composition of claim 8 which is a liquid, gel or granular composition. 


10. A detergent composition of claim 8 or claim 9 further comprising one or more additional enzymes. 

11. A starch liquefying composition comprising a mutant alpha-amylase of any one of claims 1 to 4. 

12. A detergent composition which comprises a mutant alpha-amylase and one or more additional enzymes
wherein said mutant alpha-amylase is the expression product of a mutated DNA sequence encoding an alpha-amylase,
the mutated DNA sequence being derived from a precursor alpha-amylase which is a Bacillus alpha-amylase by
substitution or deletion of an amino acid at the position equivalent to M+15 in
B. licheniformis alpha-amylase.


13. The detergent composition of claim 12 wherein said mutant alpha-amylase is M15L. 

14. The detergent composition of claim 12 or claim 13 wherein said mutant alpha-amylase comprises one or more
other site specific mutations.

16. A detergent composition as claimed in any one of claims 13 to 16 wherein said additional enzyme or enzymes
is selected from the group consisting of amylases, proteases, lipases and cellulases.

17. A method of liquefying a granular starch slurry from either a wet or dry milling process at a pH of from about 
4 to about 6 comprising: 

(a) adding an effective amount of an alpha-amylase mutant to the slurry; 

(b) optionally adding an effective amount of an antioxidant to the slurry; and 

(c) reacting the slurry for an appropriate time and at an appropriate temperature to liquefy the starch; 

wherein said alpha-amylase mutant is the expression product of a mutated DNA sequence encoding an
alpha-amylase, the mutated DNA sequence being derived from a precursor alpha-amylase which is a Bacillus
alpha-amylase by substitution or deletion of an amino acid at the position equivalent to M+15 in B. licheniformis
alpha-amylase.

18. A starch liquefying composition which comprises a mutant alpha-amylase wherein said mutant is the expression 
product of a mutated DNA sequence encoding an alpha-amylase, the mutated DNA sequence being derived from 
a precursor alpha-amylase which is a Bacillus alpha-amylase, by substitution or deletion of an amino acid at the 
position equivalent to M+15 in B. licheniformis alpha-amylase. 

19. The starch liquefying composition of claim 18 wherein said mutant alpha-amylase is M15L.
10 30 50

AGCTTGAAGAAGTGAAGAAGCAGAGAGGCTATTGAATAAATGAGTAGAAAGCGCCATATC 

70 90 110

GGCGCTTTTCTTTTGGAAGAAAATATAGGGAAAATGGTACTTGTTAAAAATTCGGAATAT 

130 150 170 

TTATACAACATCATATGTTTCACATTGAAAGGGGAGGAGAATCATGAAACAACAAAAACG 

M K Q Q K R

190 210 230 

GCTTTACGCCCGATTGCTGACGCTGTTATTTGCGCTCATCTTCTTGCTGCCTCATTCTGC 
LYARLLTLLFAL I FLLPHSA 

250 270 290 

AGCAGCGGCGGCAAATCTTAATGGGACGCTGATGCAGTATTTTGAATGGTACATGCCCAA 
A A A A N LNG T LMQYFEWY MPN

310 330 350 

TGACGGCCAACATTGGAAGCGTTTGCAAAACGACTCGGCATATTTGGCTGAACACGGTAT 
DGQHWKR LQNDS A Y L A E H G I 

370 390 410 

TACTGCCGTCTGGATTCCCCCGGCATATAAGGGAACGAGCCAAGCGGATGTGGGCTACGG
TAVW IP P AYKGTS QA DVGYG

430 450 470 

tgcttacgacctttatgatttaggggagtttcatcaaaaagggacggttcggacaaagta
aydlydlg e fhqkg tvr tky 

490 510 530 

cggcacaaaaggagagctgcaatctgcgatcaaaagtcttcattcccgcgacattaacgt 
gtkgelqsa ikslhsrdinv

550 570 590 

ttacggggatgtggtcatcaaccacaaaggcggcgctgatgcgaccgaagatgtaaccgc 
ygdvv inhkggada te dvta 

610 630 650 

ggttgaagtcgatcccgctgaccgcaaccgcgtaatttcaggagaacacctaattaaagc 
vevdpadrnrvisgehlika 

670 690 710

ctggacacattttcattttccggggcgcggcagcacatacagcgattttaaatggcattg 
wthfhfpg rgstysdfkwhw

730 750 770

gtaccattttgacggaaccgattgggacgagtcccgaaagctgaaccgcatctataagtt 
yhfdgtdwdesrklnr iykf 

790 810 830

tcaaggaaaggcttgggattgggaagtttccaatgaaaacggcaactatgattatttgat 

Q G K A W DWEVSNENGNYDYLM



FIG. 1A
850 870 890

GTATGCCGACATCGATTATGACCATCCTGATGTCGCAGCAGAAATTAAGAGATGGGGCAC 
YADI DYDHPDVAAE IKRWGT

910 930 950 

TTGGTATGCCAATGAACTGCAATTGGACGGTTTCCGTCTTGATGCTGTCAAACACATTAA 
WYANE L QL DGFRLDA VKH IK 

970 990 1010 

ATTTTCTTTTTTGCGGGATTGGGTTAATCATGTCAGGGAAAAAACGGGGAAGGAAATGTT 
FSFLR DWVNHVR EKTG KEMF 

1030 1050 1070 

TACGGTAGCTGAATATTGGCAGAATGACTTGGGCGCTCTGGAAAACTATTTGAACAAAAC 
T VAEYWQNDLG A LENYLNKT

1090 1110 1130

AAATTTTAATCATTCAGTGTTTGACGTGCCGCTTCATTATCAGTTCCATGCTGCATCGAC 
NFNHSV FD VP LHYQFHAAST

1150 1170 1190

ACAGGGAGGCGGCTATGATATGAGGAAATTGCTGAACGGTACGGTCGTTTCCAAGCATCC 
QG G G YD MR KLLNGT VVSKHP

1210 1230 1250 

GTTGAAATCGGTTACATTTGTCGATAACCATGATACACAGCCGGGGCAATCGCTTGAGTC
LKS VTFV DNHDTQPGQS LES

1270 1290 1310 

GACTGTCCAAACATGGTTTAAGCCGCTTGCTTACGCTTTTATTCTCACAAGGGAATCTGG 
TVQTWFKP LAYAFILTRESG

1330 1350 1370 

ATACCCTCAGGTTTTCTACGGGGATATGTACGGGACGAAAGGAGACTCCCAGCGCGAAAT 
YPQVFYG DMYGTKGDSQREI

1390 1410 1430

TCCTGCCTTGAAACACAAAATTGAACCGATCTTAAAACGCAGAAAACAGTATGCGTACGG 
PALKHKIEP I LKARKQYAYG 

1450 1470 1490

AGCACAGCATGATTATTTCGACCACCATGACATTGTCGGCTGGACAAGGGAAGGCGACAG 
AQ H D Y F DHHD IVGWTR EGDS

1510 1530 1550 

CTCGGTTGCAAATTCAGGTTTGGCGGCATTAATAACAGACGGACCCGGTGGGGCAAAGCG 
SVANSGL A ALITDGPGG AKR 

1570 1590 1610 

AATGTATGTCGGCCGGCAAAACGCCGGTGAGACATGGCATGACATTACCGGAAACCGTTC 
MYVG R Q N A G E TWHDITGNRS

1630 1650 1670

GGAGCCGGTTGTCATCAATTCGGAAGGCTGGGGAGAGTTTCACGTAAACGGCGGGTCGGT
E P V V I N S E G W G E F H V N G G S V

FIG. 1B
1690 1710 1730

TTCAATTTATGTTCAAAGATAGAAGAGCAGAGAGGACGGATTTCCTGAAGGAAATCCGTT 
S I Y V Q R • 

1750 1770 1790 

TTTTTATTTTGCCCGTCTTATAAATTTCTTTGATTACATTTTATAATTAATTTTAACAAA



1810 1830 1850 

GTGTCATCAGCCCTCAGGAAGGACTTGCTGACAGTTTGAATCGCATAGGTAAGGCGGGGA 



1870 1890 1910 

TGAAATGGCAACGTTATCTGATGTAGCAAAGAAAGCAAATGTGTCGAAAATGACGGTATC 

1930 1950 
GCGGGTGATCAATCATCCTGAGACTGTGACGGATGAATTGAAAAAGCT 



FIG. 1C






FIG. 1
10 30 50

ANLNGTLMQYFEWYMPNDGQHWKRLQNDSAYLAEHGITAVWIPPAYKGTSQADVGYGAYD

70 90 110

LYDLGEFHQKGTVRTKYGTKGELQSAIKSLHSRDINVYGDVVINHKGGADATEDVTAVEV 

130 150 170 

DPADRNRVISGEHLIKAWTHFHFPGRGSTYSDFKWHWYHFDGTDWDESRKLNRIYKFQGK 

190 210 230 

AWDWEVSNENGNYDYLMYADIDYDHPDVAAEIKRWGTWYANELQLDGFRLDAVKHIKFSF 

250 270 290 

LRDWVNHVREKTGKEMFTVAEYWQNDLGALENYLNKTNFNHSVFDVPLHYQFHAASTQGG

310 330 350 

GYDMRKLLNGTVVSKHPLKSVTFVDNHDTQPGQSLESTVQTWFKPLAYAFILTRESGYPQ

370 390 410 

VFYGDMYGTKGDSQREIPALKHKIEPILKARKQYAYGAQHDYFDHHDIVGWTREGDSSVA

430 450 470 

NSGLAALITDGPGGAKRMYVGRQNAGETWHDITGNRSEPVVINSEGWGEFHVNGGSVSIY

VQR 

FIG. 2
10 30 50

ANLNGTLMQYFEWYMPNDGQHWKRLQNDSAYLAEHGITAVWIPPAYKGTSQADVGYGAYD

70 90 110 

LYDLGEFHQKGTVRTKYGTKGELQSAIKSLHSRDINVYGDVVINHKGGADATEDVTAVEV

130 150 170 

DPADRNRVISGEHLIKAWTHFHFPGRGSTYSDFKWHWYHFDGTDWDESRKLNRIYKFQGK 



190 210 230 

AWDWEVSNENGNYDYLTYADIDYDHPDVAAEIKRWGTWYANELQLDGFRLDAVKHIKFSF 

250 270 290 

LRDWVNHVREKTGKEMFTVAEYWQNDLGALENYLNKTNFNHSVFDVPLHYQFHAASTQGG

310 330 350 

GYDMRKLLNGTVVSKHPLKSVTFVDNHDTQPGQSLESTVQTWFKPLAYAFILTRESGYPQ

370 390 410 

VFYGDMYGTKGDSQREIPALKHKIEPILKARKQYAYGAQHDYFDHHDIVGWTREGDSSVA

430 450 470 

NSGLAALITDGPGGAKRMYVGRQNAGETWHDITGNRSEPVVINSEGWGEFHVNGGSVSIY



VQR

FIG. 4a
AAAA

14 34 54 

ANLNGTLMQYFEWYMPNDGQHWKRLQNDSAYLAEHGITAVWIPPAYKGTSQADVGYGAYD



74 94 114 

LYDLGEFHQKGTVRTKYGTKGELQSAIKSLHSRDINVYGDVVINHKGGADATEDVTAVEV



134 154 174 

DPADRNRVISGEHLIKAWTHFHFPGRGSTYSDFKWHWYHFDGTDWDESRKLNRIYKFQGK 
SIGNAL SEQUENCE - MATURE PROTEIN JUNCTIONS IN:

B. licheniformis alpha-amylase (PstI)
MKQQKRLYARLLTLLFALIFLLPHSA'AAA|ANL

N-terminus 

B. subtilis alkaline protease aprE (PstI)

MRSKTL WISLLFALTLIFTMAFSNMSAQA^G K S 

N-terminus 

B. licheniformis alpha-amylase in pA4BL (PstI)
MRSKTLWISLLFALTLIFTMAFSNMSAQA^AAAAN.

N-terminus

B. licheniformis alpha-amylase in pBLapr

MRSKTLWISLLFALTLIFTMAFSNMSAQA^N L

N-terminus 

(PstI) indicates the location of the restriction site in the gene.

N-terminus indicates cleavage site between signal peptide and secreted prote 